perm filename KNOWL.TEX[AM,DBL]2 blob sn#398086 filedate 1978-11-27 generic text, type C, neo UTF8
\NSECP{CONCEPTS}

This chapter contains material about AM's anatomy:
the knowledge AM starts with, and how that knowledge is represented.

\SSEC{Motivation and Overview}

Each concept  consists  merely of  a bundle  of facets.   The  facets
represent  the  different  aspects  of  each  concept, the  kinds  of
questions one might want to ask about the concept:

\yskip

\parindent 0pt

$\bullet $ \ How valuable is this concept?

$\bullet $ \ What is its definition?

$\bullet $ \ If it's an operation, what is legally in its domain?

$\bullet $ \ What are some generalizations of this concept?

$\bullet $ \ How can you separate the  interesting instances of this concept  from
the dull ones?

\parindent 19pt

\vdots

\yyskip

\noindent Since each concept  represents a mathematical entity, the  kinds of questions
one might  ask are fairly constant from concept to concept.  This set
of questions might change  somewhat for a new domain  of concepts.

One ``natural" representation for a concept in LISP is  as  a
set of attri\-bute/val\-ue pairs. That is, each  concept is maintained as
an  atom with a  property list. The  {\sl names} of  the properties (Worth,
Definitions,  Domain/Range,   Generalizations,   Interestingness, etc.)
correspond  to  the  questions  above, and  the  {\sl value} stored  under
property F of atom C is simply the value of the F-facet of the C-concept.
This value can also be viewed as the answer which expert C would give, if asked
question F. Or, it can be viewed as the
contents of slot F of frame C.
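As a rough sketch of this attribute/value scheme (in modern Python rather than AM's Lisp; the facet names follow the text, but the sample values are invented placeholders):

```python
# Sketch of a concept as a bundle of facet/value pairs, mirroring a
# Lisp atom with a property list. Facet names follow the text; the
# sample values are invented placeholders.
concepts = {
    "Sets": {
        "Worth": 600,
        "Generalizations": ["Unordered-Structure",
                            "No-multiple-elements-Structure"],
        "Specializations": ["Empty-set", "Nonempty-set", "Singleton"],
    }
}

def get_facet(c, f):
    """The value of the F-facet of the C-concept: 'ask expert C question F'."""
    return concepts[c].get(f)

print(get_facet("Sets", "Worth"))        # 600
print(get_facet("Sets", "Definitions"))  # None: facet not yet filled in
```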

 \SSSEC{A Glimpse of a Typical Concept}

As an example, here is a stylized rendition of the SETS concept. This
is a  concept which is meant to correspond to  the notion of a set of
elements.  
For example, according to
the concept shown below, ``Singleton" is one entry on  the Specializations facet
of Sets ---  i.e., singletons are specific kinds of sets.

\yskip 
            
\2

\han2{Name(s): {\it Set, Class, Collection}}

\han2{Definitions: {\it }}

\han3{Recursive: {\it $\lambda $ (S) [S=$\{\}$ or Set.Definition (Remove(Any-member(S),S))] }}

\han3{	Recursive quick: {\it $\lambda $ (S) [S=$\{\}$ or Set.Definition (CDR(S))] }}

\han3{	Quick: {\it $\lambda $ (S) [Match S with $\{\ldots \}$ ] }}

\han2{Specializations: {\it Empty-set, Nonempty-set, Set-of-structures, Singleton }}

\han2{Generalizations: {\it Unordered-Structure, No-multiple-elements-Structure }}

\han2{Examples: {\it }}

\han3{Typical: {\it $\{\{\}\},\ \{A\},\  \{A,B\},\ \{3\}$ }}

\han3{Barely: {\it $\{\}, \ \{A, B, \{C, \{ \{ \{ A, C, (3,3,3,9), <4,1,A,\{B\},A>\}\}\}\}\}$}}

\han3{	Not-quite: {\it $\{A,A\}, (), \{B,A\}$}}

\han3{	Foible: {\it <4,1,A,1> }}

\han2{Conjec's: {\it All unordered-structures are sets. }}

\han2{Intu's: {\it }}

\han3{	Geometric: {\it Venn diagram.\foo{ See [Venn 89], or [Skemp 71].  } }}

\han2{Analogues: {\it Bag, List, Oset }}

\han2{Worth: {\it 600 }}

\han2{View: {\it }}

\han3{	Predicate: {\it $\lambda (P) \{x\,\epsilon \, Domain(P)\ \relv \ P(x)\}$ }}

\han3{	Structure: {\it $\lambda $(S)\  Enclose-in-braces(Sort(Remove-multiple-elements(S))) }}

\han2{Suggest: {\it If P is an interesting predicate over X, 
consider $\{x\,\epsilon \,X \ \relv\  P(x)\}$.}}

\han2{In-domain-of: {\it Union, Intersection, Set-difference, Set-equality, Subset, Member }}

\han3{In-range-of: {\it Union, Intersection, Set-difference, Satisfying }}

\yskip

\noindent \1 To decipher the Definitions facet, 
there are a few things you must know.
An expression of the form ``($\lambda $ (x) E)" is called a Lambda expression
after Church\foo{
Before and during Church, it's called a function. See [Church 41]. },
and may be considered an executable procedure.
It accepts one argument, binds the variable ``x" 
to the value of that argument, and then
evaluates ``E" (which is probably some expression involving the variable x).
For example, ``($\lambda $ (x) (x+5))" is a function which
adds 5 to any number; if given the argument 3, this lambda expression will return
the value 8.

The second thing you must know is that facet F of concept C will occasionally
be abbreviated as C.F. In those cases where F is ``executable", the notation
C.F will refer to applying the corresponding function.
So the first entry in the Definitions facet is recursive because it contains
an embedded call on the function Set.Definition.
Notice that we are implying that the \4name\0 
of that lambda expression itself is ``Set.Definition".
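A minimal Python transcription of two of the Definitions entries may make the recursion clearer (a sketch, not AM's code: sets are modelled as Python lists, and element-uniqueness checking is omitted for brevity):

```python
# Python stand-in for two of the Lisp definitions above. Because the
# recursive definition has a name, it can recur into itself, just as
# the Recursive entry calls Set.Definition.
def set_definition(S):
    # Recursive: S = {}  or  Set.Definition(Remove(Any-member(S), S))
    if S == []:
        return True
    return set_definition(S[1:])      # S[1:] plays the role of CDR(S)

def set_definition_quick(S):
    # Quick: "Match S with { ... }" -- here reduced to a type check
    return isinstance(S, list)

print(set_definition(["A", "B"]))     # True
```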

There are some unusual implications of this: since there are three separate but
equivalent definitions, AM may choose whichever one it wants when it recurs. 
AM can choose one via a random selection scheme, or always try to recur into the
same definition as it was just in, or perhaps suit its choice to the form of
the argument at the moment. 

For example, one definition might be great for arguments
of size 10 or less, but slow for bigger ones, and another definition might be
mediocre for all size arguments; then AM  should 
use the mediocre definition over and over again, until the
argument becomes small enough, 
and from then on recur only into the fast definition.
Although AM embodies this ``smart" scheme, the little comments necessary to 
see how it does so have been excised from the version shown above;
this will be detailed later in this chapter, in section 2.5.1.8.
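An invented sketch of that "smart" scheme (the size threshold and the two definitions here are placeholders; only the selection logic follows the text):

```python
# Several equivalent definitions, with the choice re-made at each
# recursive call from the argument's size. The threshold (10) and the
# definition bodies are invented placeholders.
def defn_fast_small(S, recur):
    # pretend: great for arguments of size 10 or less
    return S == [] or recur(S[1:])

def defn_mediocre(S, recur):
    # pretend: mediocre for all sizes, but safe for big arguments
    return S == [] or recur(S[1:])

def set_defn(S):
    chosen = defn_fast_small if len(S) <= 10 else defn_mediocre
    return chosen(S, set_defn)        # recur via set_defn, re-choosing

print(set_defn(list(range(25))))      # True
```

The argument shrinks under the mediocre definition until it is small enough, after which every recursive call lands in the fast one.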

All concepts possess executable definitions, though not necessarily effective
ones. They each have a LISP predicate, but that predicate is not guaranteed
to terminate. Notice that the definitions for Sets are all definitions of
finite sets.

 \SSSEC{The main constraint: Fixed set of facets}

One important  constraint on the  representation is  that the set  of
facets be fixed  for all the concepts. 
An additional constraint is that this set of facets not grow, that it be
fixed once and for all.
So there is one unchanging, universal list
of  two  dozen types  of  facets.   Every  facet of every concept
\4must\0 have one  of those standard names.  All  concepts which have
some examples  must store them as entries on a facet called Examples;
they  can't  call  them  Instances,  or  Cases,   or  G00037's.  This
constraint  is known  as the  ``\6Beings\0 constraint"\foo{  See [Lenat 75b].
Historically, each concept module  was called a  ``BEING". }, and  has
three important consequences:

1. OUTLINE:  First,  it  provides  a  nice,  distributed,  universal
framework  on  which to  display  all  that is  known  about  a given
concept.    
AM can  instantly tell what  facets are not  yet
filled  in for any given concept,  and this will  in turn  suggest new
tasks to perform.  In other  words, this constraint helps define  the
``space" which  AM must explore,  and makes it  obvious what  parts of
each concept have and have not yet been investigated.

2. STRUCTURE: The constraint  specifies that  there be a  \4set\0 of
facets, not just  one. This set  was made large  enough that all  the
efficiency advantages of a  ``structured" representation are preserved
(unlike totally uniform representations, e.g. pure production systems
with simple memories as data structures, or predicate calculus).

3. UNIFORMITY:  When AM wishes  a piece of information, it must ask a
concept  a ``question"  --- i.e., mention  a particular  facet by name.
The  benefit of the \6Beings\0  constraint is that there  is a fixed,
small repertoire of questions  it may ask, hence there will be
no long searching, no  misunderstandings.  This is the same advantage
that always accrues when everyone uses a common language.


We shall illustrate the last two advantages by using the Sets concept
pictured above.  How does AM handle a task of
this form: ``\6Check examples of Sets\1''?  AM accesses the Examples
facet of the Sets concept, and obtains a bunch of items which are all
probably sets.  If any isn't a set, AM would like to make it  one, if
that involves  nothing difficult. AM locates  all the generalizations
of Sets\foo{ by ``rippling" upward from Sets, in the Genl direction}, and
comes    up    with    the    list    <Sets,    Unordered-Structures,
No-multiple-elements-Structures,  Structures,  Objects,  Any-concept,
Anything>. Next, the ``Check" facet of each of these is examined, and
all its heuristics  are collected.   For example, the Check  facet of
the  No-multiple-elements-Structures concept  contains  the following
entry: ``Eliminate multiple  occurrences of  each element" (of  course
this is  present not as  an English sentence  but rather as  a little
LISP  function).  So  even though Sets  has no entries  for its Check
facet, several little functions  will be gathered up by  the rippling
process. Each  potential set would be subjected  to all those checks,
and might be modified or discarded as a result.

There is  enough  ``structure"  around to  keep  the  heuristic  rules
relevant to this task  isolated from very irrelevant rules,  and there
is enough ``uniformity" to make finding those rules very easy.
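The rippling-and-gathering process just described might be sketched as follows (the concept graph and the single Check heuristic are toy stand-ins for AM's):

```python
# Ripple upward along Genl links, gathering every Check heuristic
# encountered; the one heuristic here is "Eliminate multiple
# occurrences of each element", as a little function.
GENL = {
    "Sets": ["Unordered-Structures", "No-multiple-elements-Structures"],
    "Unordered-Structures": ["Structures"],
    "No-multiple-elements-Structures": ["Structures"],
    "Structures": [],
}
CHECK = {
    "No-multiple-elements-Structures": [lambda s: sorted(set(s))],
}

def ripple_up(c):
    """c plus all its generalizations, transitively."""
    found = [c]
    for g in GENL.get(c, []):
        for x in ripple_up(g):
            if x not in found:
                found.append(x)
    return found

def check(concept, item):
    # gather every Check heuristic on the way up, and apply each
    for g in ripple_up(concept):
        for heuristic in CHECK.get(g, []):
            item = heuristic(item)
    return item

print(check("Sets", ["A", "B", "A"]))   # ['A', 'B']
```

Even though Sets contributes no Check entries of its own, the potential set is still patched up by a heuristic gathered from one of its generalizations.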

The same rippling would be done to find predicates which tell whether
a  set is  interesting  or  dull.  For  example,  one  entry  on  the
Interestingness facet of the Structure  concept says that a structure
is  interesting  if  all  pairs  of  members  satisfy  the same  rare
predicate P(x,y)  [for any such  P].  So  a set,  all pairs of  whose
members satisfy ``Equality," would be considered interesting. In fact,
every Singleton is an interesting \4Structure\0 for just that reason.
A singleton might be an interesting \4Anything\0 because it takes only
a few characters to type it out (thereby satisfying a criterion on
Anything.Interest).
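The Interestingness criterion cited above can be sketched like this (Equality is the only "rare predicate" in this toy table; a singleton passes vacuously, since it has no pairs):

```python
# A structure is interesting if all pairs of its members satisfy the
# same rare predicate P(x,y), for any such P. Toy predicate table.
from itertools import combinations

RARE_PREDICATES = {"Equality": lambda x, y: x == y}

def interesting_structure(members):
    for name, p in RARE_PREDICATES.items():
        if all(p(x, y) for x, y in combinations(members, 2)):
            return name        # the predicate all pairs satisfy
    return None

print(interesting_structure(["A"]))       # 'Equality' (vacuously: no pairs)
print(interesting_structure(["A", "B"]))  # None
```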

To locate all the specializations of  Sets, the rippling would go  in
the  opposite direction.  For  example, one  of  the entries  on  the
Specializations  facet of Sets  is Set-of-structures;  one  of \4its\0
Specialization entries is Set-of-sets. So this latter concept will be
caught in the net when rippling away from Sets in the Specializations
direction.

If  AM wants lots of examples  of sets, it has  only to ripple in the
Specializations direction,  gathering  Examples  of each  concept  it
encounters.  Examples of Sets-of-sets (like this one: $\ \{\{A\},\{\{C,D\}\}\}$)
will be caught in this way, as will examples of Sets-of-numbers  (like
this  one: $\ \{$1,4,5$\}$),   because  two  specializations  of   Sets  are
Sets-of-Sets  and Sets-of-Numbers\foo{ We  are assuming that  AM has run
for some time,  and already discovered  Numbers, and already  defined
Sets-of-Numbers. }.

In addition to the three main reasons for keeping the set of facets the same
for all the concepts (see previous page), we claimed there were also reasons 
for keeping that set
fixed once and for all.
Why not dynamically enlarge it?
To add a new facet, its value has to be filled in for lots of concepts.
How could AM develop the huge body of heuristics needed to guide such
filling-in and checking activities? Also, the number of facets is small
to begin with because people don't seem to use more than a few tens of
such ``properties" in classifying knowledge about a concept\foo{ This data
is gathered from introspection by myself and a few others, and should probably
be tested by performing some psychological experiments. }.
If the viability of AM had seemed to depend on this ability, I would have worked
on it. AM got along fine without being able to enlarge its set of facets, so
no time was ever spent on that problem.
I leave it as a challenging, ambitious ``open research problem".

 \SSSEC{BEINGs Representation of Knowledge}

Before  discussing each  facet  in detail,  let's  interject a  brief
historic   digression,  to  explain  the   origins  of  this  modular
representation scheme.

The ideas arose  in an automatic  programming context, while  working
out  a solution  to the  problem  of constructing  a computer  system
capable  of  synthesizing  a  simple  concept-discrimination  program
(similar to [Winston 70]).   The scenario  envisioned was one of  mutual
cooperation  among  a  group of  a  hundred  or  so experts,  each  a
specialist in  some  minute  detail  of  coding,  concept  formation,
debugging,  communicating, etc.    Each expert  was  modelled by  one
module,  one BEING. Each BEING  had the same kinds of slots (parts,
facets), and each slot was  interpreted as a \4question\0 which  that
BEING  could  answer.    The   community  of  experts  carried  on  a
round-table discussion of a programming task which was specified by a
human user.  Eventually, by  cooperating and  answering each  other's
questions,  they hammered  out the  program he  desired.   See [Lenat
75b] for details.

The final system, called PUP6, did actually synthesize several  large
LISP  programs,  including  many  variants  of  the  concept-learning
program.   This is described in detail in  [Lenat 75a].  Unfortunately,
PUP6 had  virtually no  natural language  ability  and was  therefore
unusable by an untrained human. Its modal output was ``\4Eh?\1''.

The  search  for  a  new  problem  domain  where  this  communication
difficulty  wouldn't be so severe led  to consideration of elementary
mathematics.

The other main defect of PUP6 was its narrowness,  the small range of
`target' programs which could  be synthesized. PUP6 had been designed
with just one target in mind, and  almost all it could do was to  hit
that target.  The  second constraint on the new task  domain was then
one  of having a non-specific  target, a very broad  or diffuse goal.
This   pointed  to   an   automated   researcher,   rather   than   a
problem-solver.

These  two constraints  then  were (i)  elementary  math, because  of
communication  ease, and (ii) self-guided exploration, because of the
danger of too specific a goal.  Together, they directed the author to
an investigation which ultimately resulted in the AM project.

\SSEC{Facets}

How \4is\0 each concept
represented?
Without claiming that this is the ``best" or preferred scheme, this section
will treat in detail AM's representation of this knowledge.

We have  seen that  the representation  of a  concept can loosely  be
described as a  collection of facet/value pairs, where the facets are
drawn from a fixed set of about 25 total possible facets.

The facets break down into three categories:


\noindent $\bullet $ \ Facets  which  relate  this  concept  C  to  some  other  one(s):
Generalizations,  Specializations,   Examples,  Isa's,  In-domain-of,
In-range-of, Views, Intu's, Ana\-lo\-gies, Conjec's

\noindent $\bullet $ \ Facets which  contain  information intensive  to this  concept C itself:
Definitions, Algorithms, Domain/Range, Worth, Interest

\noindent $\bullet $ \ Sub-facets, containing heuristics, which can be tacked onto facets from either
group above. These include: Suggest, Fillin, Check (and might be extended
to include Justification, Origin, and other fields)

\yskip

Some facets  come in  several flavors  (e.g., there  are really  four
separate  facets  --- not  just one  --- which  point  to Examples:  boundary,
typical, just-barely-failing, foibles).


This section will cover each facet  in turn.  Let's begin by  listing
each of  them. For a  change of pace,  we'll show a  typical question
that each one might answer about concept C:\foo{ In this discussion, ``C"
represents the name  of the concept whose  facet is being  discussed,
and may be read ``the given concept". }

\yskip

\hangindent 30pt Name: What shall we call C when communicating with the user?

\hangindent 30pt Generalizations: Which other concepts have less restrictive definitions than C?

\hangindent 30pt Specializations: Which concepts have C's definition plus some additional 
constraints?

\hangindent 30pt Examples: What are some things that satisfy C's definition?

\hangindent 30pt Isa's: Which concepts' definitions does C itself satisfy?\foo{ Notice that C will
therefore be an Example of each member of Isa's(C). }

\hangindent 30pt In-domain-of: Which operations can be performed on C's?

\hangindent 30pt In-range-of: Which operations result in values which are C's?

\hangindent 30pt Views: How can we view some other kind of entity as if it were a C?

\hangindent 30pt Intu's: What is an abstract, analogic representation for C?

\hangindent 30pt Analogies: Are there similar (though formally unrelated) concepts?

\hangindent 30pt Conjec's: What are some potential theorems involving C?

\hangindent 30pt Definitions: How can we tell if x is an example of C?

\hangindent 30pt Algorithms: How can we execute the operation C on a given argument?

\hangindent 30pt Domain/Range: What kinds of arguments can operation C be executed on? What kinds of
values will it return?

\hangindent 30pt Worth: How valuable is C? (overall, aesthetic, utility, etc.)

\hangindent 30pt Interestingness:  What special features make a C especially interesting?

\yyskip

\noindent In addition, each facet F of concept C can possess a few little subfacets which
contain heuristics for dealing with that facet of C's:

\yskip

\hangindent 30pt C.F.Fillin: How can entries on C.F be filled in?
These heuristics get called on when the current task is \6``Fillin
facet F of concept X"\1, where X is a C.

\hangindent 30pt C.F.Check: How can potential entries on C.F be checked and patched up?

\hangindent 30pt C.F.Suggest: If AM gets bogged down, what are some new tasks (related to C.F) 
it might consider?

 \SSSEC{Generalizations/Specializations}

\qq{Generalization makes possible conscious, controlled, and accurate accommodation
of one's existing schemas, not only in response to the demands for assimilation
of new situations as they are encountered, but {\sl ahead} of these demands,
seeking or creating new examples to fit the enlarged concept.}{Skemp}



We say concept A ``\4is a generalization of\1'' concept B iff
every example of B is an example of A. Equivalently, this is true iff
the definition of B can be phrased as ``$\lambda $ (x) [A.Defn(x) {\it and} P(x)]'';
that is, for x to satisfy B's definition, it must satisfy A's definition plus
some additional predicate P.
The Generalizations facet of concept C will be abbreviated as C.Genl.

C.Genl does not contain \4all\0 generalizations of 
C; rather, just the ``immediate" ones. More formally, if A is a generalization
of B, and B of C, then C.Genl will \4not\foo{The 
reason for these strange constraints is so that the total number of links
can be minimized. There is no harm if a few redundant ones sneak in.
In fact, frequently-used paths are granted the status of single links, as we
shall soon see.} \0  contain a pointer to A.
Instead, C.Genl will point to B, and B.Genl will point to A.

\yskip

\noindent
Here are the recursive equations which permit a search process to quickly find
all generalizations or specializations of a given concept X:


Generalizations({\it X}) 
 $ =↓{df} \   Genl↑*(X) =↓{df} \  \{X\} \union Generalizations(X.Genl) $

Specializations({\it X}) 
 $ =↓{df} \   Spec↑*(X) =↓{df} \  \{X\} \union Specializations(X.Spec) $

\yskip

\noindent For the reader's convenience, here are the
similar equations, presented elsewhere in the text, 
for finding all examples of ---  and Isa's of --- X:

\yskip

Examples({\it X}) $ =↓{df} \  Spec↑*(Exs(Spec↑*(X))) $

Isa's({\it X}) $ =↓{df} \  Genl↑*(Isa(Genl↑*(X))) $

\yyskip
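The Generalizations equation transcribes directly into code (the toy Genl table below follows the Odd-primes example in the text; Specializations is symmetric, reading Spec links instead of Genl links):

```python
# Generalizations(X) = {X} U Generalizations(X.Genl), computed over a
# toy Genl table whose entries hold only the "immediate"
# generalizations of each concept.
GENL = {
    "Odd-primes": ["Odd-numbers", "Primes"],
    "Odd-numbers": ["Numbers"],
    "Primes": ["Numbers"],
    "Numbers": [],
}

def generalizations(x):
    """Genl*(X): x plus everything reachable along Genl links."""
    result = {x}
    for g in GENL.get(x, []):
        result |= generalizations(g)
    return result

print(sorted(generalizations("Odd-primes")))
# ['Numbers', 'Odd-numbers', 'Odd-primes', 'Primes']
```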

\noindent  The {\it format}
of the Generalizations facet is quite simple: it is a list of 
concept names. The Generalizations facet for Odd-primes might be:

(Odd-numbers    Primes)


We've been talking about both Specializations and Generalizations as if they were 
very similar to each other. It's time to make that more explicit:

Specializations are the converse of Generalizations. The format is the same, and
(hopefully) A is an entry on B's Specializations facet iff B is an entry on
A's Generalizations facet. 

The uses of these two facets are many:


\noindent  $\bullet $\ AM can sometimes establish independently that 
A is both a generalization and
a specialization of B; in that case, AM would like to 
recognize that fact easily, so it can
conjecture that A and B specify
equivalent concepts. Such coincidences are easily detected as {\it cycles}
in the
graph of all Genl (or Spec) relations known to AM.
In these cases, AM may physically merge A and B 
(and all the other concepts in the cycle)
into one concept.

\noindent  $\bullet $\ Sometimes, AM 
wants to assemble a list of all specializations (or 
generalizations) of X, so that
it can test whether some statement which is just barely true (or false) for X
will hold for any of those specializations of X.

\noindent  $\bullet $\ Sometimes, 
the list of generalizations is used to assemble a list of 
isa's; similarly, the list of specializations helps assemble a list of examples.

\noindent  $\bullet $\ A common and crucial use of the list of 
generalizations is to locate all the
heuristic rules which are relevant to a given concept. Typically, the relevant
rules are those tacked onto Isa's of that concept, and the list of Isa's is built up
from the list of generalizations of that concept.

\noindent  $\bullet $\ To incorporate new knowledge. If AM learns,
conjectures, etc. that A is a 
specialization of B, then all the machinery (all the theorems, algorithms, etc.)
for B become available for working with A. 

\yyskip

Here is a little trick that
deserves a couple paragraphs of its own. AM stores the answers to common questions
(like ``What are \4all\0 the specializations of Operation") explicitly,
by intentionally permitting redundant links to be maintained.
If two requests arrive closely in time, to test whether A is a generalization
of B, then the result is stored by adding ``A" 
as an entry on the Generalizations facet of
B, and adding ``B" as a new entry on 
the Specializations facet of A. 
The slight extra space is more than recompensed in cpu time saved.
Computer scientists will perceive the analogy between this redundant
storage (to save later re-computation) and the well-known hardware technique of
{\it caching}.

If the result were False (A turned out not to be a generalization of B) then the
links specify that finding explicitly,
so that the next request does not generate a long search again.
Such failures are recorded on two additional facets: Genl-not and Spec-not.
Since {\sl most} concept pairs A/B are related by Spec-not and by Genl-not,
the only entries which get recorded here are the ones which were frequently
called for by AM. If space ever gets tight, all such redundant links can be
discarded with no permanent damage done. 

These two ``shadow" facets (Genl-not and Spec-not) are not useful or interesting
in their own right.
If AM ever wished to know all
the concepts which are \4not\0 generalizations of C, the fastest way would
be to take the set-difference of $\{$all concepts$\}$ and Generalizations(C).
Since they are quite incomplete, Genl-not and Spec-not are used more like a cache
memory: they save time whenever they are applicable, and don't really cost
much when they aren't applicable.
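The cache-like behavior of these shadow facets might be sketched so (toy Genl table; the search and the positive/negative recording follow the text):

```python
# Record both positive and negative answers to "is A a generalization
# of B?" as redundant links, so the next identical request avoids a
# long search. The Genl table is a toy stand-in.
GENL = {"Odd-primes": ["Primes"], "Primes": ["Numbers"], "Numbers": []}
genl_cache = {}       # like redundant Genl links
genl_not_cache = {}   # like the Genl-not shadow facet

def is_generalization(a, b):
    """True iff a is reachable from b along Genl links."""
    if a in genl_cache.get(b, ()):
        return True
    if a in genl_not_cache.get(b, ()):
        return False
    frontier, seen = [b], set()
    while frontier:
        c = frontier.pop()
        if c == a:
            genl_cache.setdefault(b, set()).add(a)      # record success
            return True
        seen.add(c)
        frontier += [g for g in GENL.get(c, []) if g not in seen]
    genl_not_cache.setdefault(b, set()).add(a)          # record failure
    return False

print(is_generalization("Numbers", "Odd-primes"))   # True, now cached
print(is_generalization("Odd-primes", "Numbers"))   # False, in Genl-not
```

If space ever gets tight, both caches can simply be emptied; the searches still succeed, just more slowly.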

 \SSSEC{Examples/Isa's}

\qq{Usually,  to show  that  a definition  implies  no contradiction,  we
proceed  {\sl by example},  we try  to make an  {\sl example} of  a thing
satisfying the definition. We wish to  define a notion A, and we  say
that, by  definition, an A  is anything for which  certain postulates
are true. If we can demonstrate directly that all these postulates are
true of a  certain object  B, the definition  will be justified;  the
object B will be an {\sl example} of an A.}{Poincar\'e}


Following Poincar\'e, we say ``\4concept A is an example of concept B\1'' iff
A satisfies B's definition.\foo{
What does this mean? B.Defn is a Lisp predicate, a Lambda expression.
If it is fed A as its argument, and it returns True, we say that
A is a B, or that A satisfies B's definition. If B.Defn returns NIL,
we say that A is not a B, or that A fails B's definition. If B.Defn
runs out of time before returning a T/NIL value, there is no definite
statement of this form we can make. 
} Equivalently, we say that ``\4A isa B\1''.
It would be legal (in that situation) for ``A" to be an entry on
B.Exs (the Examples facet of concept B) and for ``B" to be an entry on
A.Isa (the Isa's facet of concept A).

The Examples facet of C does not contain \4all\0 examples of 
C; rather, just the ``immediate" ones. 
The examples facet of Numbers will not contain ``11" since it
is contained in the examples facet of Odd-primes.
A ``rippling" procedure is used to acquire a list of all examples of a given
concept. The basic equation is:


\noindent Examples({\it X}) 
$ =↓{df} \  Specializations(Exs(Specializations(X))) $

where Exs(x) is the contents of the examples facet of x.
Examples(x) represents the final list of all known items which satisfy
the definition of X. Examples(x) thus must include Exs(x).
Specializations(x) =  Spec$↑*$(x); i.e.,
all members of x.Spec, all members of \4their\0 Spec facet, etc.
Note the similarity of this to the formula for Isa's(x), given on the last
page.
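A sketch of that equation (toy Spec and Exs tables; since the toy Exs entries are ground items rather than concepts, the outer Spec* pass would be a no-op and is omitted):

```python
# Examples(X) = Spec*(Exs(Spec*(X))): ripple down the Spec links,
# then read off each Exs facet. Toy link tables; each Exs facet holds
# only the "immediate" examples, as the text describes.
SPEC = {"Numbers": ["Primes"], "Primes": ["Odd-primes"], "Odd-primes": []}
EXS = {"Numbers": [4], "Primes": [2], "Odd-primes": [3, 5, 7, 11]}

def spec_star(x):
    """x plus all its specializations, transitively."""
    out = {x}
    for s in SPEC.get(x, []):
        out |= spec_star(s)
    return out

def examples(x):
    found = set()
    for c in spec_star(x):
        found |= set(EXS.get(c, []))   # Exs of each specialization
    return found

print(sorted(examples("Numbers")))     # [2, 3, 4, 5, 7, 11]
```

Note that 11 is reached from Numbers even though it is stored only on the Exs facet of Odd-primes.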


As with Generalizations/Specializations, the reason behind the
incomplete pointer structure is simply to save space, and to minimize the
difficulty of updating the graph structure whenever new links are found.
Suppose a new Mersenne prime
is computed. Wouldn't it be nice simply to add a
single entry to the Exs facet of Mersenne-primes, rather than to have to update the
Exs pointers from a dozen concepts?
There is no harm if a few redundant links sneak in.

\yskip

``Isa's" is the converse of ``Examples". The format is the same, and
(if A and B are both concepts) A is an entry on B.Isa iff B is an entry on
A.Exs. In other words, A is a member of Examples(B) iff B is a member of
Isa's(A). 


The \4uses\0 of the Exs/Isa's facets are similar to those for Genl/Spec 
(see previous subsection), but 
their formats are quite a bit more complicated
(at the
implementation level).
There are really a cluster of different facets all related to Examples:


\noindent  $\bullet $\ TYPICAL: This is a list of average examples. 
Care must be taken to include a wide
spectrum of allowable kinds of examples. 
For ``Sets", these would include sets of varying
size, nesting, complexity, type of elements, etc.

\noindent  $\bullet $\ BOUNDARY: Items which just barely pass the 
definition of this concept. This might
include items which satisfy the base step of a recursive definition, or items which
were intuitively believed to be \4non\1-examples of the concept. 
For ``Sets", this
might include the empty set.

\noindent  $\bullet $\ BOUNDARY-NOT: Items which just 
barely fail the definition. This might include an
item which had to be slightly modified during checking, 
like $\{$A,B,A$\}$ becoming 
$\{$A,B$\}$. 

\noindent  $\bullet $\ FOIBLES: Total failures. 
Items which are completely against the grain of this
concept. For ``Sets", this might include the operation ``Compose".

\noindent  $\bullet $\ NOT: This is the ``cache" trick 
used to store the answers to frequently-asked
questions. If AM frequently wants to know whether X is an example of Y, and
the answer is \4No\1, then much time can be saved by adding X as an entry to the
Exs-not facet of Y.

\yskip

An individual item on these facets may just be a concept name, or it may be
more complicated.
In the case of an operation, it is an item of the form <$a↓1a↓2\ldots \ \→\ v$>;
 i.e., actual
arguments $a↓i$ and the value $v$ returned. In the case of objects, 
it is an object of
that form (e.g., Sets.Exs might contain $\{k,\,r\}$ 
as one entry.)

Here is a more detailed illustration. Consider the
Examples facet of Set-union. It might appear thus:


\han4{TYPICAL: $\{$A$\}\union \{$A,B$\}\ \→\ \{$A,B$\}$;}

\han7{$\{$A,B$\}\union \{$A,B$\}\ \→\ \{$A,B$\}$;}

\han7{$\{$A,<3,4,3>,$\{$A,B$\}\}\union \{$3,A$\}\ 
\→\ \{$A,<3,4,3>,$\{$A,B$\}$,3$\}$.}

\han4{BOUNDARY: $\{\}\union X\ \→\ X $\foo{
Actually, AM is not quite smart enough to use the variable X as shown in the
boundary examples. It would simply store a few instances of this general rule, plus
have an entry of the form <Equivalent: Identity(X) and Set-union(X,$\{\}$)> 
on the
Exs facet of Conjectures.
Notice that because of the asymmetric way Set-union was defined,
only one lopsided boundary example was found. If another definition were supplied,
the converse kind of boundary examples would be found. }}

\han4{BOUNDARY-NOT: $\{$A,B$\}\union \{$A,C$\}\ \→\ \{$A,B,A,C$\}$;}

\han7{$\{$A,B,C,D$\}\union \{$E,F,G,H,I,J$\}\ \→\ \{$A,B,C,E,F,G,H,I,J$\}$}

\han4{FOIBLES: <2,A,2>}

\han4{NOT:  {\it no entries}}

\yyskip

\noindent The format for Isa's is much simpler: 
there are only two kinds of links, and they're each merely a
list of concept names.
Here is the Isa facet of Set-union:

\han4{ISA: (Operation\foo{ This entry is redundant. }  Domain=Range-op)}

\han4{ISA-NOT: (Structure Composition Predicate)}

\yskip

\noindent At some time, some rule asked whether Set-union {\it isa} 
Composition. As a result,
the negative response was recorded by adding ``Composition" to the Isa-not
facet of Set-union, and adding ``Set-union" to the Exs-not subfacet of the
Examples facet of the concept Composition 
(indicating that Set-union was definitely not an
example of Composition, yet there was no reason to consider it a foible).

 \SSSEC{In-Domain-of/In-Range-of}


We shall say that A  is in the domain of B (written  ``A In-dom-of B")
iff


\noindent $\bullet \ $ A and B are concepts

\noindent $\bullet \ $ B isa Operation

\noindent $\bullet \ $  A is equal to (or at least a
specialization of) one of the domain components of the operation B. That is,
B can be executed using any example of A as one of its arguments.

\yskip

Although it can be recomputed very easily, we  may wish to record the
fact  that A  In-dom-of B by  adding the  entry ``B" to  the In-dom-of
facet of  A.    AM  may even  wish  to  add this  new  entry  to  the
Domain/range facet of B (where A is a specialization of the $j↑{th}$
domain component of B):

\noindent 
<$D↓1\ D↓2\ldots \  D↓{j-1} \  A \ D↓{j+1} \ldots \  D↓i \ \→\  R$>.

The semantic  content of  ``In-dom-of" is:  what can  be done  to 
any example of
a given
concept C?   Given an example of concept C,  what operations can be
run on that thing?  E.g.,
``Primes In-dom-of Squaring"  tells us that we can apply the operation
Squaring to any particular prime number we wish.

\yskip

Let us now turn from ``In-dom-of" to the related facet ``In-ran-of."
We say  that concept  A is  in the  range of  B iff  B  is an  Activity
and A is (a specialization of) the range of B. 

For example, Odd-perfect-squares is In-ran-of Squaring, since 
(i) Squaring is an operation, hence an Activity, and (ii) 
one of its Domain/range
entries is <Numbers $\ \→\  $ Perf-squares>, and
Perf-squares is a generalization of Odd-perfect-squares\foo{ Why? Because
Generalizations(Odd-perfect-squares)   is   the   set   of   concepts
$\{$Odd-numbers  Perf-squares Numbers  Objects  Any-concept Anything$\}$,
hence contains Perf-squares. 
}.

Here is what  the In-ran-of facet  of Odd-perfect-squares might  look
like:


\han5{(Squaring Add TIMES Maximum Minimum Cubing)}

\noindent
Each of these operations will --- at least 
sometimes --- produce an odd perfect square
as its result.

Semantically, the  In-ran-of relation  between A and  B means  that one
might be  able to produce examples of 
A by running operation B.  
Aha!
This is a  potential mechanism for finding  examples of a concept  A.
All you need  do is get hold of In-ran-of(A),  and run each of those
operations. Even more expeditious is  to examine the Examples  facets
of each  of those operations,  for already-run examples  whose values
should  be tested using A.Defn,  to see if they  are examples of A's.
AM relies on this  in times of high motivation;  it is too ``blind"  a
method to use heavily all the time.
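The "blind" example-finding method just mentioned might look like this in Python (all names and toy data here are illustrative; in AM this machinery lived inside heuristic rules):

```python
# Scan the already-run examples of each operation in In-ran-of(A), and
# keep those results which satisfy A's definition predicate.

def examples_via_in_ran_of(a_defn, in_ran_of, examples_of):
    found = []
    for op in in_ran_of:
        for result in examples_of.get(op, []):
            if a_defn(result):          # test the value with A.Defn
                found.append(result)
    return found

# Toy data: In-ran-of(Odd-perfect-squares) includes Squaring and Cubing.
is_odd_perfect_square = lambda n: n % 2 == 1 and int(n ** 0.5) ** 2 == n
examples_of = {"Squaring": [4, 9, 16, 25], "Cubing": [8, 27, 729]}
print(examples_via_in_ran_of(is_odd_perfect_square,
                             ["Squaring", "Cubing"], examples_of))
# → [9, 25, 729]
```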

This facet is  also useful for generating  situations to investigate.
Suppose  that the  Domain/range facet  of Doubling contains  only one
entry: < Numbers $\ \→\  $Numbers >. Then syntactically, Odd-numbers is in the
range of Doubling. Eventually a heuristic rule may have AM spend some
time looking for an example of Doubling, where the result was an  odd
number.  If none is quickly found, AM conjectures  that it \4never\0 will be
found.    Since one  definition  of Odd-number(x)  is  ``Number(x) and
Not(Even-number(x))", the only non-odd  numbers are even numbers.  So
AM will increment the Domain/range facet of Doubling with the entry 
<Numbers$\ \→\ $Even-numbers>, and remove the old entry. Thus Odd-numbers
will no longer be In-ran-of Doubling.  AM can of course chance upon
this conjecture  in a more positive  way, by noticing  that all known
examples of Doubling have results which are examples of Even-numbers.\foo{ This
positive approach is in fact the way AM noticed this particular relationship. }.

A more productive result  is suggested by  examining the cases  where
Odd-perfect-squares are the result of cubing.   The smallest such odd
numbers  are 1, 729, and 15625.  In general, these numbers  are all those of
the form ${(2n+1)}↑6$.{\footnote { }{$↑6$ Wrong.
That was an exponent, not a footnote!}}
How could AM notice such an awkward relationship?

The general question to ask, when A  In-ran-of B,
is ``What is the set of domain items whose values (under the operation
B) are A's?"  In case the answer is ``All" or ``None", some special
modifications can be made  to the Domain/range facets  and In-dom-of,
In-ran-of  facets of various  concepts, and  a new conjecture  can be
printed.  In  other   cases,  a  new   concept  might  get   created,
representing precisely  the set  of all  arguments to  B which  yield
values  in A.   If you will,  this is the  inverse image  of A, under
operation B.  In the case of B a predicate, this might be the  set of
all arguments  which satisfy the  predicate. 

In the case  of B=Cubing
and A=Odd-perfect-squares, the heuristic mentioned above will have
AM create a new concept: 
the inverse image of Odd-perfect-squares under the operation of Cubing.
That is, find numbers whose cubes are Odd-perfect-squares.
It is quickly noticed that such numbers are precisely the set of
Odd-perfect-squares themselves! So
the Domain/range facet of Cubing might get this new entry:
<Odd-perfect-squares $\ \→\ $Odd-perfect-squares>.
But not all odd perfect squares can be reached by cubing; only a few of them can.
AM will notice this, and
the new range would then be isolated and might be renamed
by the  user ``Perfect-sixth-powers".   Note that all  this
was   brought    on   by   examining    the   In-ran-of    facet   of
Odd-perfect-squares.  ``Cubing" was just one of the six
entries there. There are five more stories to tell in this
tiny nook of AM's activities.

How exactly does  AM go about gathering  the In-ran-of and  In-dom-of
lists?  Given a  concept C, AM can scan down the global tree of operations
(the Exs and Spec links below the concept `Active').
For  if C  is not In-dom-of  F, it  certainly won't  be In-dom-of any
specialization of  F. Similarly,  if it can't  {\sl legally } be 
produced  by F,  it
won't  be produced by  any specialization  of F: if  you can't  get x
using Doubling you'll never get it  by Doubling-perfect-numbers. 
So AM simply  ripples
around, as usual. The  precise code for this algorithm  is of little
interest.   There are  not that many  operations, and it  is cheap to
tell whether  X  is  a specialization  of  a  given concept,  so  even  an
exhaustive search wouldn't be  prohibitive. Finally, recall that such
a  search is  not  done all  the time.   It  will be  done initially,
perhaps, but  after that  the In-dom-of and  In-ran-of networks  will
only need slight updating now and then.
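The pruned rippling search might be sketched as follows (Python; `children` stands in for the Spec links below `Active', and the `genls`/`domains` tables are toy data, not AM's structures):

```python
# Walk the tree of operations below `Active', skipping the whole subtree
# under F whenever C is not In-dom-of F (a specialization of F has a
# narrower domain, so it could not apply either).

genls = {"Primes": "Numbers", "Perfect-numbers": "Numbers", "Numbers": None}

def is_spec(a, b):
    # True iff concept a equals b or lies below b via Genl links.
    while a is not None:
        if a == b:
            return True
        a = genls.get(a)
    return False

domains = {"Doubling": "Numbers",
           "Doubling-perfect-numbers": "Perfect-numbers"}
children = {"Active": ["Doubling"],
            "Doubling": ["Doubling-perfect-numbers"]}

def in_dom_of(c, op):
    return is_spec(c, domains[op])

def gather_in_dom_of(c, root):
    found = []
    def walk(op):
        if not in_dom_of(c, op):
            return                  # prune the entire subtree under op
        found.append(op)
        for spec in children.get(op, []):
            walk(spec)
    for op in children.get(root, []):
        walk(op)
    return found

print(gather_in_dom_of("Primes", "Active"))   # → ['Doubling']
```

Primes qualifies for Doubling but not for Doubling-perfect-numbers, so the latter's subtree is never examined.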

 \SSSEC{Views}

Often, two concepts  
A and B will be inequivalent, yet there will be a ``natural"
bijection between one and (a subset of) the other.
 
For example, consider a finite set S of atoms, and consider the
set of all its subsets, 2$↑S$,
also called the \4power set\0 of S, $\Pscr $(S).
Now S is a member of, but not a \4subset\0 of, 2$↑S$
(e.g., if S=$\{x,y,\ldots\}$, then $x$ is not a member of 2$↑S$). 
On the other hand, we can identify or view
S as a subset by the mapping  $v\  →\ \{v\}$. 
Then S is associated with the following
subset of 2$↑S$: $\{ \{x\}, \{y\},\ldots \}$. 
Why would we want to do this? Well, it shows
that S is identified with a \4proper\0 subset of 2$↑S$, and indicates that S has
a lower cardinality (remember: all sets are finite).

As another example, most of us would agree that the set $\{x, \{y\}, z\}$ can be
associated with the following bag: $(x, \{y\}, z)$. 
Each of them can be viewed as
the other. Sometimes such a viewing is not 
perfectly natural, or isn't
really a bijection: how could
the bag (2, 2, 3) be viewed as a set? Is $\{2,3\}$ better or worse than
$\{2,\{2\},3\}$?

The View facet of a concept C describes how to view instances of another concept D
as if they were C's. For example, this entry on the View facet
of Sets explains how to view any given structure as if it were a Set:

\yskip

\noindent \6Structure: \it $\lambda $ (x) Enclose-in-braces(Sort(Remove-multiple-elements(x)))\1

\yskip

\noindent If given the list <z,a,c,a>, this little program would remove
multiple elements (leaving  <z,a,c>), sort the structure
(making it <a,c,z>), and replace the ``<$\ldots $>" 
by ``$\{\ldots \}$", leaving the
final value as $\{$a,c,z$\}$. Note that this transformation is not 1-1 (injective); 
the list <a,c,z> would get transformed into this same set.
On the other hand, it may be more useful than transforming
the original list into $\{$z,$\{$a,$\{$c,$\{$a$\}\}\}\}$ 
which retains the ordering and
multiple element information.
Both of those transformations may be present as entries on the View
facet of Sets.
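For concreteness, the first Viewing program above might be rendered in Python as (the printed string form of the result is only a stand-in for AM's actual set structures):

```python
# The Structure-as-Set view: remove multiple elements, sort, then
# "enclose in braces".

def view_structure_as_set(structure):
    deduped = []
    for x in structure:
        if x not in deduped:        # Remove-multiple-elements
            deduped.append(x)
    return "{" + ",".join(sorted(deduped)) + "}"   # Sort, Enclose-in-braces

print(view_structure_as_set(["z", "a", "c", "a"]))   # → {a,c,z}
```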

As it turns out, the View facet of Sets actually contains only the following
information:

\yskip

\noindent \6Structure: \it $\lambda $ (x) Enclose-in-braces(x) \1

\yskip

\noindent Thus the Viewing will produce entities which are not quite sets. 
Eventually,
AM will get around to executing a task of the form ``Check Examples of Sets",
and at that time the error will be corrected.
One generalization of Sets is No-multiple-elements-Structures, and one of its
entries under Examples.Check says to remove all multiple elements. Similarly,
Unordered-structures is a generalization of Sets, and one of its Examples.Check 
subfacet entries
says to sort the structure. If either of these alters the structure, the old structure 
is added to the Boundary-not subfacet (the `Just-barely-miss' kind) of 
Examples facet of Sets.

The syntax of the View facet 
of a concept C
is a list of entries; each entry specifies the name of
a concept, X, and a little program P. If it is desired to view an instance of
X as if it were a C, then program P is run on that X; the result is (hopefully)
a C. The programs P are opaque to AM; they must have no side effects and be quick.

Here is an entry on the View facet of Singleton:

\yskip

\noindent \6Anything: \it $\lambda $ (x) Set-insert(x, $\phi $) \1

\yskip

\noindent In other words, to view anything as a singleton set, 
just insert it into the
empty set. Note that this is also one way to view {\sl anything} as a set. 

\yskip

As you've
no doubt guessed, there is a general formula explaining this:

\yskip

\han5{$Views(X) =↓{df}  View\biglp Specializations(X)\bigrp $}

\yskip

\noindent Thus, 
to find all the ways of viewing something as a C, AM ripples away from C
in the Spec direction, gathering all the View facets along the way.
All of their entries are valid entries for C.View as well.

\yskip

In addition to these built-in ways of using the Views facets, some special
uses are made in individual heuristic rules.
Here is a heuristic rule which employs the Viewing facets of relevant concepts in
order to find some examples of a given concept C:

\yskip


\han2{{\it IF the current task is to Fill-in Examples of C,}}

\han3{{\it and C has some entries on its View facet,}}

\han3{{\it and one of those entries <X,P> indicates a concept 
X which has some known Examples,}}


\han2{{\it THEN run the associated program P on each member of Examples(X),}}

\han3{{\it and add the following task to the agenda: ``Check Examples of C", for the
following reason: ``Some very risky techniques were used to find examples of C",
and that reason's rating is computed as: 
Average(Worth(X), $\|$ the examples of C found in this manner$\|$).}}

\yskip


\noindent Say the task selected from the agenda 
was ``Fill-in Examples of Sets".
We saw that one entry on Sets.View was \6Structure: 
$\lambda $(x) Enclose-in-braces(x)\1.
Thus it is of the form <X,P>, with X=Structure. The above heuristic rule will
trigger
if any examples of Structures are known. The rule will then use the
View facet of Sets to find some examples of Sets.
So AM will go off, gathering all the examples of structures.
Since Lists is a Specialization of
Structure, the computation of Examples(Structures) will eventually ripple
downwards and ask for Examples of Lists. If the Examples facet of Lists
contains the entry <z,a,c,a,a>, then this will be retrieved 
as one of the members of Examples(Structure). The heuristic rule takes each such
member in turn, and
feeds it to Sets.View's
little program P. 
In this case, the program replaces the list brackets with set braces, thus converting
<z,a,c,a,a> to $\{$z,a,c,a,a$\}$. 

In this manner, all the existing structures will be converted
into sets, to provide examples of sets.
After all such conversions take place, a great number
of potential examples of Sets will exist. The final action of the right side of the
above heuristic rule is to add the new task ``\6Check examples of Sets\1'' to the
agenda. When this gets selected, all the ``slightly wrong" examples will be fixed
up. For example, $\{$z,a,c,a,a$\}$ 
will be converted to $\{$a,c,z$\}$.

If any reliance is made on those unchecked examples, there is the danger of
incorrectly rejecting a valid conjecture. This is not too serious, since the
very first such reliance will boost the priority of the task
``\6Check examples of Sets\1'', and it would then probably be the very next task 
chosen.
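The whole rule might be sketched in Python as follows (the agenda tuples, Worth values, and `enclose` program are all illustrative stand-ins for AM's actual agenda machinery):

```python
# For each entry <X, P> on C.View with known examples of X, run P over
# those examples to propose raw examples of C, and queue a "Check
# Examples of C" task whose reason rating averages Worth(X) with the
# number of examples found.

def fill_in_examples_via_views(c, view, examples_of, worth, agenda):
    proposed = []
    for x, p in view:
        xs = examples_of.get(x, [])
        if not xs:
            continue                     # rule does not trigger for this entry
        found = [p(e) for e in xs]
        proposed.extend(found)
        rating = (worth[x] + len(found)) / 2
        agenda.append(("Check Examples", c,
                       "Some very risky techniques were used", rating))
    return proposed

enclose = lambda lst: set(lst)           # crude stand-in for Enclose-in-braces
agenda = []
out = fill_in_examples_via_views(
    "Sets", [("Structure", enclose)],
    {"Structure": [["z", "a", "c", "a", "a"]]},
    {"Structure": 200}, agenda)
print(out)
```

Here the list <z,a,c,a,a> becomes the (unchecked) set $\{$z,a,c$\}$, and one "Check Examples of Sets" task lands on the agenda.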

 \SSSEC{Intuitions}

\qq{The mathematician does not work like a machine; we cannot overemphasize
the fundamental role played in his research by a special intuition
(frequently wrong), which is not common-sense, but rather a divination
of the regular behavior he expects of mathematical beings.}{Bourbaki}


This facet  turned out to be a ``dud", and  was later excised from all
the concepts. It will be  described below anyway, for the benefit  of
future  researchers.    Feel  free  to  skip  directly  to  the  next
subsection.

The  initial idea was  to have a  set of a few  (3--10) large, global,
opaque LISP  functions. Each of  these functions  would be termed  an
``Intuition" and would have some suggestive name like ``jigsaw-puzzle",
``see-saw", ``archery", etc.   Each  function would  somehow model  the
particular activity implied by  its name. There would be  a multitude
of  parameters which could  be specified by  the ``caller"  as if they
were the arguments of the function.  The function would then  work to
fill  in values  for  any  unspecified parameters.    That's all  the
function  does.    The  caller  would  also  have  to  specify  which
parameters were to be considered as the ``results" of the function.

For  the  see-saw,  the  caller  might  provide  the  weight  of  the
left-hand-side sitter, and the final position of the see-saw, and ask
for the  weight of  the right-hand  sitter. The  function would  then
compute  that weight  (as  any  random number  greater/less-than  the
left-hand weight,  depending on the desired tilt  of the board).  Or,
the caller  might  specify the  two weights  and  ask for  the  final
position. 

The See-saw function is an expert on this subject; it has
efficient code for computing any values which can be computed, and
for randomly instantiating any variables which may take on any value
(e.g., the first names of the people doing the sitting). When an individual
call is made on this function, the caller is not told how the final values
of the variables were computed, only what those values end up as.
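As a purely hypothetical reconstruction (recall that these functions were excised from AM, and none of their code appears in this document), a See-saw function might have looked like:

```python
# The caller fixes any subset of {left weight, right weight, tilt}, and
# the function fills in values for the rest, opaquely and fallibly.

import random

def see_saw(left=None, right=None, tilt=None):
    if left is None:
        left = random.randint(1, 100)
    if tilt is None and right is not None:
        tilt = "left" if left > right else "right" if right > left else "level"
    if right is None:
        if tilt == "left":                    # left side is down: heavier
            right = random.randint(1, left)   # fallible: a tie is possible
        elif tilt == "right":
            right = left + random.randint(1, 100)
        else:
            right = left
    return {"left": left, "right": right, "tilt": tilt}

# Caller supplies the left weight and the final position, asks for right:
result = see_saw(left=40, tilt="left")
print(1 <= result["right"] <= 40)   # → True
```

Note the deliberate fallibility: the "heavier" side can occasionally tie, in the spirit of a see-saw that is only antisymmetric most of the time.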


So  the  Intuitions were  to  be  experimental laboratories  for  AM,
wherein it  could get some (simulated) real-world empirical data.  If
the seesaw  were the Intuition  for ``>",  and ``weight" corresponded  to
``Numbers", then  several relationships might  be visualized intuitively
(like the anti-symmetry of ``>"). 

This is a nice idea, but in practice
the only relationships  derived in this  way were the ones  that were
thought  up while  trying to  encode the  Intuition functions.   This
shameful behavior  led  to  the  excision of  the  Intuitions  facets
completely from the system.

As another example, suppose AM is considering composing two relations
R  and S.  If  they have no common  Intuition reference, then perhaps
they're not meaningfully  composable.  If they  do both tie into  the
same  Intuition function,  then  perhaps that  function  can tell  us
something about the composition. 

This is a nice idea, but in practice
very few prunings were accomplished this way, and no unanticipated
combinations were fused. 

\yskip

Each Intuition  entry is like  a ``way  in" to one  of the few  global
scenarios.  It can be characterized as follows:

\yskip


\noindent $\bullet \ $   One  of the  salient  features of  these  
entries ---  and  of the
scenarios --- is that AM is absolutely forbidden to look  inside them,
to try  to analyze them.   They  are {\it opaque}\1.   Most Intuition
functions  use numbers and  arithmetic, and it would  be pointless to
say that  AM  discovered such  concepts  if it  had access  to  those
algorithms all along.

\noindent $\bullet \ $   The  second  characteristic   of  an  Intuition  is  that  it  be
{\it fallible}\1.  As  with human  intuition, there  is no  guarantee
that what is suggested  will be verified even empirically,  let alone
formally.   Not  only  does this  make the  programming  of Intuition
functions easier, it was meant  to provide a degree of ``fairness"  to
them.  AM  wasn't cheating quite as much if  the See-saw function was
only antisymmetric 90\%\ of the time.

\noindent $\bullet \ $   Nevertheless, the  intuitions are very  
{\it suggestive}\1.  Many
conjectures can be  proposed only via  Intuitions. Some analogies (see  the
next subsection) can also be suggested via common intuitions.

\yyskip

After  they were  coded  and running,  I decided  that  the intuition
functions  were  unfair;  they   contained  some  major   discoveries
``built-in"  to  them.   They  had  the  power  to propose 
otherwise-obscure  new concepts and potential relationships.  
They contributed nothing other than what was originally programmed into them;
\4they were not synergetic\1.
Due to this dubious  character of
the contributions by  AM's few Intuition functions,
they were removed  from the  system.  All  the examples  and all  the
discoveries  listed  in   this  document  were  made   without  their
assistance.

We  shall now drop this  de-implemented idea.  I  think there is some
real opportunity for research here.
For the benefit of any future researchers in this area,
let me point to the excellent
discussion of analogic representations in [Sloman 71].

 \SSSEC{Analogies}

\qq{The 
whole idea of analogy is that `Effects', viewed as a function of situation,
is a {\sl continuous} function.}{Poincar\'e}


As  with Views  and  Intuitions, this facet  is useful  for  shifting
between  one part  of the  universe  and another.   Views  dealt with
transformations between two specific concepts; Intuitions dealt  with
transformations  between a  bunch of  concepts and  a large  standard
scenario  which was  carefully hand-crafted  in advance.   In contrast,
this facet  deals with transforming  between a  list of concepts  and
another list of concepts.

Analogies  operate on a  much grander  scale than Views.  Rather than
simply transforming a few isolated items, they initiate the  creation
of many  new concepts.   Unlike Intuitions, they  are not  limited in
scope beforehand, nor are they opaque. They are dynamically proposed.

The  concept of  ``prime numbers"  is \4analogous\0  to the  notion of
``simple groups".
While not isomorphic, you might guess at a few relationships
involving simple groups just by my telling you this fact:
simple groups are to groups what primes are to numbers.\foo{
If a group is not simple, it can be factored. 
Unfortunately, the factorization of a group into simple groups is not unique.
Another analogizing contact: For each prime p, we can
associate the cyclic group of order p, which is of course simple. 
AM never came up with the concept of simple groups; this is just an illustration
for the  reader. }

Let's  take   3  elementary  examples,   involving  very  fundamental
concepts:


\noindent $\bullet \ $  AM was told how to {\it View} a set as if it were a bag.

\noindent $\bullet \ $  AM was told it could {\it Intuit} the relation ``$≥$''
as the predetermined ``See-saw"
function.

\noindent $\bullet \ $  AM, by itself, once {\it Analogize}d that these two 
lists correspond:


<Bags, Same-length, Operations $\Fscr ↓i$ on&into Bags>

<Bags-of-T's, Equality, Those $\Fscr ↓i$ restricted to Bags-of-T's>

\yskip

\noindent The 
concept of a bag,  all of whose elements are ``T"'s,  is the unary
representation  of  \4numbers\0  discovered  by  AM.  When the  above
analogy (third one) 
is first  proposed, there are many known  Bag-operations\foo{
i.e., all  entries on  In-dom-of(Bag) and  In-ran-of(Bag); a  few of
these are: Bag-insert, Bag-union, Bag-intersection}, but there are as
yet no numeric  operations\foo{ Examples of Operation  whose domain/range
contains ``Number". }.  This triggers one of  AM's heuristic rules, which
spurs AM on to finding the analogues of specific Bag-operations. That
is, what  special properties  do the  bag-operations have when  their
domains and/or ranges are restricted from  Bags to Bags-of-T's (i.e.,
Numbers).    In  this  way,  in  fact,  AM  discovers   Addition  (by
restricting Bag-union to the  Domain/range <Bags-of-T's Bags-of-T's $\ \→\ $
Bags-of-T's>), plus many other nice arithmetic functions.
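The heart of that discovery can be shown in a few lines of Python (bags are modeled as lists, and Bag-union is taken here as the merge which adds multiplicities; both modeling choices are mine, not AM's):

```python
# Represent the number n as a bag of n "T"s; restricting Bag-union to
# bags-of-T's then behaves exactly like Addition.

def bag_union(b1, b2):
    return b1 + b2            # merge contents, adding multiplicities

def number_as_bag(n):
    return ["T"] * n

three, four = number_as_bag(3), number_as_bag(4)
print(len(bag_union(three, four)))   # → 7, i.e. 3 + 4
```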

Well,  if it  leads  to the  discovery of  Addition, that  analogy is
certainly worth having.  How would an analogy like  that be proposed?
As the reader might expect by now, the mechanism
is simply  some heuristic rule adding it as an entry to
the Analogies facet of a certain concept.  For example:

\yskip

\han3{{\it IF the current task has just created a canonical specialization C2 of
concept C1, with respect to operations F1  and F2, [i.e., two members
of C2 satisfy F1 iff they satisfy F2],}}

\han3{{\it THEN add the following entry to the Analogies facet of C2:}}


\han4{{\it  <C1,\ F1,\ Operations-on-and-into(C1)>}}

\han4{{\it  <C2,\ F2,\ Those operations restricted to C2's>}}

\yskip

After  generalizing ``Equality"  into the operation  ``Same-length", AM
seeks to  find a  canonical\foo{ A  natural, standard  form.   All  bags
differing in only  ``unimportant" ways should be  transformed into the
same canonical  form.  
Two bags B1 and B2 which have the same length should get transformed into
the same canonical bag.
} representation for Bags. That is, AM seeks a
canonizing function f, such that (for any two bags x,y)
$ Same-length(x,y) \↔  Equal( f(x), f(y) ).$\0
Then the range of f would delineate the set of  ``canonical" Bags.  AM
finds  such an  f and  such a  set of  canonical bags:  the operation f
involves replacing each element  of a bag by  ``T", and the  canonical
bags are those whose elements  are all T's.  In this  case, the above
rule   triggers,   with   C1=Bags,  C2=Bags-of-T's,   F1=Same-length,
F2=Equality, and the analogy  which is produced  is the one shown  as
the third example above.
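The canonizing function f which AM found is trivial to sketch (Python, with bags again modeled as lists):

```python
# f replaces every element of a bag by "T".  Then Same-length(x, y)
# holds exactly when the canonical forms are Equal.

def canonize(bag):
    return ["T" for _ in bag]

x, y = ["a", "b", "c"], [5, 5, 9]
print(canonize(x) == canonize(y))          # → True: same length
print(canonize(x) == canonize([1, 2]))     # → False: lengths differ
```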

The  Analogy facets  are not  implemented in  full generality  in the
existing LISP version of AM, and for that reason I shall refrain from
delving deeper into  their format.   Since good research has  already
been  done  on reasoning  by  analogy\foo{  An  excellent discussion  of
reasoning by analogy is found  in [P\'olya 54]. Some early work on  emulating
this was reported in [Evans 68]; a  more recent  thesis on  this topic  is
[Kling 71].  }, I  did not view it as a central feature of my work. Very
little space will be devoted to it in this document.

An important  type  of analogy  which was  untapped  by AM  was  that
between heuristics.  If two  situations were similar, then conceivably the
heuristics  useful in one  situation might be useful  (or have useful
analogues) in the  new situation.   Perhaps this is  a viable way  of
enlarging  the known heuristics.   Such ``meta-level"  activities were
kept to a minimum throughout AM, and this 
proved to be a serious limitation.

Let me  stress  that the  failure  of  the Intuitions  facets  to  be
nontrivial was due to  the lack of spontaneity which  they possessed.
Analogies  facets were useful  and ``fair"  since their uses  were not
predetermined by the author.

 \SSSEC{Conjec's}

The ``Conjec" facet of a concept C is a list of
relationships which involve C.
There are several uses for C.Conjec:


\noindent $\bullet \ $  Store awkwardly-phrased conjectures: this wasn't really useful,
since most conjectures fell out naturally as simple relationships,
expressible, e.g., as a single Genl arc or a single Isa arc.

\noindent $\bullet \ $  Store flimsy conjectures: apparent 
relationships worth remembering but  not quite believed yet.

\noindent $\bullet \ $  Hold
heuristics which notice and check conjectures.

\noindent $\bullet \ $  Obviate the need for many nearly-indistinguishable concepts:
Collapse the entire essence of a  concept like ``Odd-primes"
 into one or two relationships
involving ``Primes"; then discard ``Odd-primes".

\noindent $\bullet \ $  Untangling paradoxes: a historic record, to facilitate
backtracking in case of catastrophe, which (luckily!)  wasn't ever needed.

\noindent $\bullet \ $  Improve existing algorithms (e.g., once you know primes are odd,
hunt only through odd numbers), improve
testing procedures, representations, etc.

\noindent $\bullet \ $  Display  AM's most impressive observed relationships in a form which is easily
inspectable by the user.

\yskip

The syntax of this facet is simply a list of conjectures, where each conjecture
has the form of a relationship: (R a b c$\ldots $d). R is the name of a known operation;
a, b, c$\ldots $ are its arguments, and the conjecture claims that d is its value.
For instance,  ``\it (Same-size Insert(S,S) S False)\1" is a conjecture
that
inserting a set into itself
will always  give you a set of a different length.
This conjecture happens to be true for finite sets.
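That conjecture is easy to test mechanically; here is a sketch in Python, using frozensets so that a set may itself appear as an element:

```python
# Test "(Same-size Insert(S,S) S False)" on a few finite sets: inserting
# a set into itself always changes the length, since no finite set can
# already contain itself.

def insert(x, s):
    return s | {x}

def same_size_after_self_insert(s):
    return len(insert(s, s)) == len(s)

for s in [frozenset(), frozenset({1}), frozenset({1, 2, 3})]:
    print(same_size_after_self_insert(s))   # → False each time
```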

 \SSSEC{Definitions}

A typical way to disambiguate a concept from all others is to provide
a ``definition" for it.\foo{ As EPAM studies showed
[Feigenbaum 63],
one can
never be sure that this definition will specify the  concept uniquely
for all time.  In the distant future, some new  concept may differ in
ways  thought to  be  ignorable at  the present  time. }  Almost every
concept  had some  entries  initially  supplied on  its  ``Definitions"
facet.   The  format of this  facet is  a list  of entries,  each one
describing a  separate  definition.  A single  entry  will  have  the
following parts:

\yskip

\noindent $\bullet \ $       Descriptors:     Recursive/Linear/Iterative,      Quick/Slow,
Opaque/Transparent, Once-only/Early/Late, Destructive/Nondestructive.

\noindent $\bullet \ $   Relators: Reducing  to the  definition  of concept  X, Same  as Y
except$\ldots $, Specialized version of Z, Using the definition of W, etc.

\noindent $\bullet \ $  Predicate: A small, executable piece of LISP code, to tell  if any
given item is an example of this concept.

\yskip

The semantics of this are that
the predicate or ``code" part  of the entry (i) 
must be faithfully  described by the
Descriptors,  and (ii) must be related to other  concepts just as the Relators
claim. 

Here is a typical entry from the Definitions facet of the Set-union concept:

\yskip

\han3{{\6 Descriptors: {\it Slow, Recursive, Transparent }}}

\han3{{\6 Relators: {\it Uses the algorithm for Set-insert, Uses the definition of Empty-set,
Uses the definition of Set-equal, Uses the algorithm for Some-member, 
Uses the algorithm for Set-delete, Uses the definition of Set-union }}}

\han3{{\6 Code: {\it $\lambda $ (A B C) }}}

\han5{{\bf IF \it   Empty-set.\ Defn(A)}}

\han5{{\bf THEN \it  Set-equal.\ Defn(B,C) }}

\han5{{\bf ELSE }}

\han6{{\it X $←$ Some-member.\ Alg(A)\foo{The  
notation ``X $←$  Some-member.Alg(A)" means that  any one algorithm
for the concept Some-member should be accessed, and then it should be
run on the argument A. The result,  which will be an element of A, is
to be assigned the name ``X".  The effect is to bind the variable X to
some member of set A.} }}

\han6{{\it A $←$ Set-delete.\ Alg(X,A) }}

\han6{{\it B $←$ Set-insert.\ Alg(X,B) }}

\han6{{\it Set-union.\ Defn(A,B,C) }}

\yskip


\noindent Let me  stress that this  is just  one entry from  one facet of  one
concept.
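For readers who find the LISP-flavored pseudo-code opaque, the same transparent definition might be transliterated into Python (built-in set operations stand in for the Set-insert, Set-delete, etc. concepts; this rendering is mine, not AM's code):

```python
# Set-union.Defn(A, B, C): test whether C really is the union of A and
# B, by moving elements of A into B one at a time.

def set_union_defn(a, b, c):
    a, b = set(a), set(b)
    if not a:                          # Empty-set.Defn(A)
        return b == set(c)             # Set-equal.Defn(B, C)
    x = next(iter(a))                  # X ← Some-member.Alg(A)
    a = a - {x}                        # A ← Set-delete.Alg(X, A)
    b = b | {x}                        # B ← Set-insert.Alg(X, B)
    return set_union_defn(a, b, c)     # Set-union.Defn(A, B, C)

print(set_union_defn({1, 2}, {2, 3}, {1, 2, 3}))   # → True
print(set_union_defn({1, 2}, {2, 3}, {1, 3}))      # → False
```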



This particular definition is  not very efficient, but it  is described
as  Trans\-par\-ent.  That means  it is very well  suited to analysis and
modification by  AM itself.   Suppose  some heuristic  rule wants  to
generalize this definition. It can peer inside it, and, e.g., replace
the  base step call  on Set-equal, by  a call on  a generalization of
Set-equal (say ``Same-length"\foo{ For disjoint sets,  the new definition
would specify the operation which we call ``addition". }).

How could  \4different\0 definitions help here? Suppose  there were a
definition which first  checked to  see if the  three arguments  were
Set-equal to each  other, and if so  then it instantly returned  T as
the  value of the  definition predicate; otherwise,  it recurred into
Set-union.Defn again.  This might be  a good algorithm to try at  the
very beginning, but if the Equality test fails, we don't want to keep
recurring  into this definition.   This algorithm should  thus have a
Descriptor labelling it ONCE-ONLY EARLY.

There are three purposes to having Descriptors and Relators hanging
around:


\noindent 1. For the benefit  of the user. AM appears  more intelligent because
it can \4describe\0 the kind of definition it is using --- and why.

\noindent 2. For the  sake of efficiency. When all AM wants to do is to evaluate
Set-union(A,B),  it's best just to grab a \4fast\0 definition.  When   trying  to
generalize Set-union, it's more appropriate to
modify a very clean, transparent definition --- even if it is a slow one.  

\noindent 3. For the  benefit of  the  heuristic rules.  Often, a  left-  or a
right-hand-side will  ask about  a certain  kind of  definition.  For
example, \it ``If  a transparent  definition of  X exists,  then try  to
specialize X"\1.

\yskip

Let me pull back the  curtain a little further, and expose the actual
implementation of  these  ideas in  AM.    The secrets  about  to  be
revealed will  not be  acknowledged anywhere  else in this  document.
They  may,  however, be  of  interest to  future  researchers.   Each
concept may have a cluster of Definition facets, just as it  can have
several  kinds  of  Examples   facets.  These  include  three  types:
Necessary  and  sufficient  definitions,  necessary  definitions, and
sufficient  definitions.     These   three  types   have  the   usual
mathematical  meanings.   All that  has been  alluded to  before (and
after this subsection) is the necc&suff  type of definition (x is  an
example of C \4if and only if\0  x satisfies C.Def/necc&suff). Often,
however, there will be a much quicker sufficient definition (\4if\0 x
satisfies C.Def/suf, then x is certainly a C).   Similarly,
entries  on C.Def/nec  are  useful  for quickly  checking  that x  is
\4not\0 an  example of C (to check this, it suffices to verify that x
\4fails\0 to satisfy a necessary definition of C).

So given the task of deciding whether or not x is an example of C, we
have many alternatives:


(i) If x is a concept, see if C is a member of x.ISA (if so, then x is
an example of C).

(ii) Try to locate x within C.Exs. (depending upon the flavor of subfacet
on which x is found,  this may show that x is or is not
an example of C).

(iii) If x is a concept, ripple to  collect ISA's(x), and see if C is  a
member of ISA's(x).

(iv) If there is a fast sufficient definition of C, see if x satisfies
it.

(v) If  there is a fast  necessary definition of C, see  if x fails it
(if so, then x is \4not\0 an example of C).

(vi) If there is a necessary and sufficient definition of C, see whether
or not x satisfies that definition (this may show that x is or is not
an example of C).

(vii) Try to locate x within C.Exs. (depending upon the flavor of subfacet
on which x is found,  this may show that x is or is not
an example of C).

(viii) Recur:  check to see if x is an example of any specialization of
C.

(ix) Recur: check  to  see  if  x  is \4not\0  an  example  of  some
generalization of C (if so, then x is \4not\0 an example of C).

\yskip

\noindent In fact,  there is a LISP function,  IS-EXAMPLE, which performs those
nine steps in that order.  At each  moment, there is a timer set, so  even
if there is a necessary and sufficient  definition hanging around, it
might run out of time before settling the issue one way or the other.
Each time the function recurs,  the timer is granted a smaller  and
smaller quantum, until finally it  has too little to bother recurring
anymore.  There is a potential overlap of activity: to see if x is an
example of  C, the  function might  ask  whether x  is or  is not  an
example of a particular generalization  of C (step (ix), above); to test
\4that\1, AM might get to step (viii), and again ask if x is an example of
C. Even though the timer would eventually  terminate this fiasco (and
even  though  the true  answer  might be  found  despite  this wasted
effort) it  is  not  overly smart  of  AM  to fall  into  this  loop.
Therefore, a stack is maintained of all concepts whose definitions the
IS-EXAMPLE function has tried to test on argument x.  As the function
recurs, it adds the current value of C to that stack; this value
is removed when the recursion pops back to this level, i.e., when that recursive call
``returns" a value.
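
The control structure just described can be rendered as a sketch in a modern language (Python here; AM itself was written in Interlisp, and the concept structure below is purely illustrative):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    examples: set = field(default_factory=set)
    sufficient_defn: object = None        # fast sufficient definition, if any
    necessary_defn: object = None         # fast necessary definition, if any
    specializations: list = field(default_factory=list)
    generalizations: list = field(default_factory=list)

def is_example(x, C, deadline, visited=frozenset()):
    """Is x an example of C?  True / False / None (undecided or timed out)."""
    if C.name in visited or time.time() > deadline:
        return None                       # the stack check and the timer
    visited = visited | {C.name}          # "push" C for this recursive call
    if x in C.examples:                   # direct lookup, steps (i)-(ii)
        return True
    if C.sufficient_defn and C.sufficient_defn(x):    # step (iv)
        return True
    if C.necessary_defn and not C.necessary_defn(x):  # step (v)
        return False
    # each recursion is granted a smaller and smaller quantum of time
    child_deadline = time.time() + (deadline - time.time()) / 2
    for S in C.specializations:           # step (viii)
        if is_example(x, S, child_deadline, visited):
            return True
    for G in C.generalizations:           # step (ix)
        if is_example(x, G, child_deadline, visited) is False:
            return False
    return None
```

With mutually-linked concepts (a specialization pointing back up to its generalization), the `visited` stack is what stops steps (viii) and (ix) from chasing each other forever.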

 \SSSEC{Algorithms}

Earlier, we said that each concept can have any facets from the universal fixed set
of 25 facets.
This is not strictly true. Sometimes, a whole class
of concepts will possess a certain type of facet which no others may meaningfully
have. 
That is, there will be a \4domain of applicability\0 for the facet, just
as we defined such domains of applicability for heuristics.
For example, consider the ``Domain/Range" facet. It is meaningful only to
``operations".


Let's view this in a more general light.
For each facet f, the only concepts which
can have entries for facet f are examples of some particular concept $Dscr $(f).
$Dscr $(f) is the domain of applicability of facet f.
If C is any concept which is not an example of $Dscr $(f), then it can never
meaningfully possess any entries for that facet f.
For almost all facets f, $Dscr $(f) is ``Any-concept". Thus any concept can possess
almost any facet. 
For example, $Dscr $(Defn)=``Any-concept", so any concept may have definitions.
But $Dscr $(Domain/range)=``Operation". So only operations can have domain/range facets.

Similarly, $Dscr $(Algorithms)=``Actives". This facet is the subject of this section.
The Algorithms facet is present for all --- but only
for --- Actives
(predicates, relations, and operations).

The representation is, just as with Definitions, 
a list of entries, each one
describing a separate algorithm, and each one having three parts:
Descriptors, Relators, Program.
Instead of a LISP predicate, however, the Algorithms facets possess a LISP function
(an executable piece of code whose value 
will in general be other than True/False).

All the details about understanding the descriptors and relators are embedded in the
fine structure of the heuristic rules. A left-hand-side may test whether a
certain kind of algorithm exists for a given concept. A right-hand-side which
fills in a new algorithm must also worry about filling 
in the appropriate descriptors
and relators. As with newly created concepts, such information is trivial to fill in
at the time of creation, but becomes much harder after the fact.

Here is a typical heuristic rule which results in a new entry being added to the
Algorithms facet of the newly-created concept named 
Compose-Set-Intersect\-&\-Set-Intersect\foo { The 
operation ``$ \inter \circ \inter $" takes 
three sets and intersects them}:

\han2{{\it IF the task is to Fillin Algorithms for F, }}

\han3{{\it and F is an example of Composition}}

\han3{{\it and F has a definition of the form $ F =↓{df} G\circ H$,}}

\han3{{\it and F has no transparent, nonrecursive algorithm,}}



\han2{{\it THEN add a new entry to the Algorithms facet of F,}}

\han3{{\it with Descriptors: Transparent, Non-recursive}}

\han3{{\it with Relators: Reducing to G.Alg and H.Alg, 
Using the Definition of <G.Domain>}}

\han3{{\it with Program:}}

\han5{{\it  $\lambda  (\|<G.\,Domain>\|,\,\|<H.\,Domain>\| - 1,\,X)$}}


\han6{{\it (SETQ X (H.\,Alg $\|$<G{\2.}Domain>$\|$))}}

\han6{{\it (AND }}


\han7{{\it (<G{\2.}Domain>{\2.}Defn X) }}

\han7{{\it (G{\2.}Alg X $\|$<H{\2.}Domain>$\| - 1$))}}

\yskip

\noindent The intent of the little program which gets created
is to apply the first operator, check that the
result is in the domain of the second, and then apply the second operator.
The expression $\|<G.\,Domain>\|$ means find a domain/range entry for G,
count how many domain components there are, and form a list that long
from randomly-chosen variable names {\it (u,v,w,x,y,z).}

For the case mentioned above, F = Compose-Set-Intersect&Set-Intersect, 
G = Set-Intersect,
and H = Set-Intersect. The domain of
G is a pair of Sets, so

\noindent $\|<G.\,Domain>\|$ is a list
of 2 variables, say ``{\it (u v).}" Similarly, 
$\|<H.\,Domain>\| - 1$\ 
is a list
of 1 variable, say ``{\it (w)}." Putting all this together, we see that
the new algorithm entry created for
Compose-Set-Intersect&Set-Intersect
would look like this:

\yskip

\han3{{\6 Descriptors: {\it Non-Recursive, Transparent }}}

\han3{{\6 Relators: {\it Reducing to Set-Intersect{\2.}Alg, Using the definition of Sets }}}

\han3{{\6 Program: {\it $\lambda $ (u,v,w,X) }}}

\han6{{\it (SETQ X (Set-Intersect{\2.}Alg u v)) }}

\han6{{\it (AND }}

\han7{{\it (Sets{\2.}Defn X) }}

\han7{{\it (Set-Intersect{\2.}Alg X w)) }}

\yyskip
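
Rendered as a sketch in a modern language (Python here; the actual entry is Interlisp, and the helper names are illustrative), the generated program applies the inner algorithm, checks the result against the definition of the outer operation's domain, and then applies the outer algorithm:

```python
def set_intersect(u, v):
    """Stand-in for Set-Intersect.Alg."""
    return u & v

def is_set(x):
    """Stand-in for Sets.Defn."""
    return isinstance(x, (set, frozenset))

def compose_intersect_intersect(u, v, w):
    """The generated entry: lambda (u, v, w, X)."""
    x = set_intersect(u, v)                   # (SETQ X (Set-Intersect.Alg u v))
    return is_set(x) and set_intersect(x, w)  # (AND (Sets.Defn X)
                                              #      (Set-Intersect.Alg X w))
```

Thus `compose_intersect_intersect({1,2,3}, {2,3}, {3})` yields `{3}`, the three-way intersection; as in the LISP AND, a failed domain check simply returns false.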

At times, AM will
be capable of producing only a slow algorithm for some new concept C.
For example, TIMES\inv (x) was 
originally defined by AM as a blind, exhaustive search
for bags of numbers whose product is x. As AM uses that algorithm more and more,
AM records how slow it is. Eventually, a task is selected of the form
\6``Fillin new algorithms for C"\1, with the two reasons being that the existing
algorithms are all too slow, and they are used frequently.
At this point, AM should draw on a body of rules which take a declarative
definition and transform it into an efficient algorithm, or which take an
inefficient algorithm and speed it up. Doing a good job on just those rules
would be a mammoth undertaking, and the author decided to omit them.
The reader who wishes to know more about rules for creating
and improving LISP algorithms is directed to [Darlington and Burstall 73].
A more general discussion of the principles involved can be found in [Simon 72].

 \SSSEC{Domain/Range}

Another facet possessed only by active concepts is Domain/Range.
The syntax of this facet is quite simple. It is a list of entries, each of the form

\noindent \6< $D↓1 D↓2 \ldots  \ \→\  R$ >\1, where
there can be any number of $D↓i$'s 
preceding the arrow,
and $R$ and all the $D↓i$'s are the names of concepts.
Semantically, this entry means that the active concept may be run on
a list of arguments 
where the first one is an example of $D↓1$, the second an example of $D↓2$, 
etc.,
and in that case will return a value
guaranteed to be an example of $R$.
In other words, the concept may be considered a relation on the 
Cartesian product
$D↓1\times D↓2\times \cdots \ \times R$.
We shall say that the \4domain\0 of the concept is
$D↓1\times D↓2\times  \cdots \,$, and that its \4range\0 is $R$. Each $D↓i$
is called a 
\4component\0 of the domain.


For example, here is what the Domain/Range facet of TIMES might look like:

\han2{$\{$}

\han3{{\it < Numbers Numbers $\ \→\ $ Numbers > }}

\han3{{\it < Odd-numbers Odd-numbers $\ \→\ $ Odd-numbers > }}

\han3{{\it < Even-Numbers Even-Numbers $\ \→\ $ Even-numbers > }}

\han3{{\it < Odd-numbers Even-Numbers $\ \→\ $ Even-Numbers > }}

\han3{{\it < Perf-Squares Perf-Squares $\ \→\ $ Perf-Squares > }}

\han3{{\it < Bags-of-Numbers $\ \→\ $ Numbers > }}

\han2{$\}$}

\yskip


The Domain/Range part is useful for pruning away absurd compositions, and
for syntactically suggesting compositions and ``coalescings". 
Let's see what this means.


Suppose some rule sometime tried to compose TIMES$\circ $Set-union.
A rule tacked onto Compose says to ensure that the range of Set-union
at least intersects 
(and preferably is \4equal\0 to)
some component of the domain of TIMES. But there
are no entities which are both sets and numbers;
ergo this fails
almost instantaneously.
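
That pruning test can be sketched as follows (Python; the extensional overlap test stands in for AM's concept-level intersection check, and the concept extensions are illustrative):

```python
def can_compose(outer_domain, inner_range, intersects):
    """The rule on Compose: the inner operation's range must at least
    intersect some component of the outer operation's domain."""
    return any(intersects(inner_range, D) for D in outer_domain)
```

For TIMES composed with Set-union, no entity is both a set and a number, so the test fails at once:

```python
ext = {"Numbers": {1, 2, 3}, "Sets": {frozenset()}, "Odd-numbers": {1, 3}}
meets = lambda a, b: bool(ext[a] & ext[b])
can_compose(["Numbers", "Numbers"], "Sets", meets)         # fails
can_compose(["Numbers", "Numbers"], "Odd-numbers", meets)  # succeeds
```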


The claim was also made that Domain/Range facets help propose plausible coalescings.
By ``\4coalescing\1'' an operation, 
we mean defining a new one, which differs from the original one in that
a couple of the arguments must now coincide. For example,
coalescing TIMES(x,y) results in the new operation F(x) defined as TIMES(x,x).
Syntactically, we can coalesce a pair of domain components of the domain/range
facet of an operation if those two domain components are equal, or if
one of them is a specialization of the other, or even if they merely intersect.
In the case of one related to the other by specialization, the more specialized
concept will replace both of them. In the case of mere intersection, an extra test
will have to be inserted into the definition of the new coalesced operation.

Given this domain/range entry for Set-insert: < Anything Sets $\ \→\ $ Sets >,
we see that
it is ripe for coalescing. Since Sets is a specialization of Anything, the new
operation F(x), which is defined as Set-insert(x,x), will have a domain/range
entry of the form < Sets $\ \→\  $ Sets >. That is, the specialized concept Sets
will replace both of the old domain elements (Anything and Sets).
F(x) takes a set x and inserts it into itself. 
Thus F($\{a,b\}$) = $\{a,b,\{a,b\}\}$.
In fact, this new operation F is very exciting because it always seems to give
a new, larger set than the one you feed in as the argument.
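
The coalescing just described can be sketched like this (a hypothetical Python rendering; Python requires a `frozenset` before one set can be inserted into another):

```python
def set_insert(x, s):
    """Set-insert, with domain/range entry <Anything Sets -> Sets>."""
    return s | {x}

def F(s):
    """The coalesced operation F(x) = Set-insert(x, x): <Sets -> Sets>."""
    return set_insert(frozenset(s), set(s))
```

For instance `F({1, 2})` gives `{1, 2, frozenset({1, 2})}`; the result always has one more element than the argument, which is what makes F exciting.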

We have seen how the Domain/range facets can 
prune away meaningless coalescings, as well as meaningless
compositions. Any proposed composition or coalescing will at least be syntactically
meaningful. If every composition is proposed for at least one good
semantic reason, then those passing the domain/range test, and hence
those which ultimately get created, will all be valuable new concepts.
Since almost all coalescings are semantically interesting, \4any\0 of them which
have a valid Domain/Range entry will get created and probably will be interesting.

This facet is occasionally used to suggest conjectures to investigate. For example,
a heuristic rule says that if the domain/range entries have the form
<$D\, D \,D \,\ldots   \ \→\  genl(D)$\foo{ ``{\it genl(D)}" means: 
any generalization of concept D. }>, 
then it's worthwhile seeing whether the value of this
operation doesn't really always lie inside $D$ itself. 
This is used right after the
Bags$\↔$Numbers analogy is found, 
in the following way. One of the Bag-operations
known already is Bag-union. The analogy causes AM to consider a new operation,
with the same algorithm as Bag-union, but restricted to Bags-of-T's 
(numbers in unary
representation). The Domain/range facet 
of this new, restricted mutation of Bag-union
contains only this entry:
<Bags-of-T's Bags-of-T's $\ \→\ $  Bags>. 
Since Bags is a generalization of Bags-of-T's, the
heuristic mentioned above triggers, 
and AM sees whether or not the union of two Bags-of-T's is
always a bag containing only T's. It appears to be so, even in extreme cases, so the
old Domain/range entry is replaced by this new one:
<Bags-of-T's Bags-of-T's $\ \→\ $  Bags-of-T's>.
When the user asks AM to call these bags-of-T's ``numbers", this entry becomes
<Numbers Numbers $\ \→\  $ Numbers>. 
In modern terms, then, the conjecture suggested was that the sum of two numbers
is always a number.
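
The empirical check AM performs can be sketched as follows (Python, with `Counter` standing in for bags; bag-union is rendered as multiset sum, matching the ``sum of two numbers'' reading, and all names are illustrative):

```python
from collections import Counter

def bag_union(a, b):
    """Bag-union as multiset sum: (T T) union (T T T) = (T T T T T)."""
    return a + b

def is_bag_of_Ts(b):
    """Does this bag contain only T's?"""
    return set(b) <= {"T"}

# test the conjectured closure on some examples, including extreme cases
examples = [Counter(), Counter(T=1), Counter(T=3), Counter(T=7)]
closed = all(is_bag_of_Ts(bag_union(a, b))
             for a in examples for b in examples)
```

Closure holding on every sampled pair is what licenses narrowing the entry to <Bags-of-T's Bags-of-T's $\→$ Bags-of-T's>.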

To sum up this last ability in fancy language,
we might say that
one mechanism for proposing
conjectures is the prejudicial belief in the unlikelihood of asymmetry.
In this case, it is asymmetry in the parts of a Domain/range entry that
draws attention.
Such conjecturing can be done by any action part of any heuristic rule;
the Conjec facet entries don't have a monopoly on initiating this type of activity.

 \SSSEC{Worth}

\noindent How  can we  represent  the  worth of  each  concept? Here  are  some
possible suggestions:

\yskip

1. The  most intelligent 
(but most difficult)
solution is  ``purely symbolically". That is,
an individualized  description of  the  good and  bad points  of  the
concept; when it is useful, when misleading, etc.

2. A  simpler solution would  be to ``standardize" the  above symbolic
description  once and for all, fixing  a universal list of questions.
So each concept would have to answer the questions  on this list (How
good  are  you  at  motivating  new  concepts?, How  costly  is  your
definition to  execute?,$\ldots $). The  answers  might each  be  symbolic;
e.g., arbitrary English phrases.

3. To simplify this scheme even more, we  can assume that the answers
to each question will be numeric-valued functions (i.e., LISP code
which can be evaluated  to yield a number between  0 and 1000).   The
vector of numbers produced by {\it Eval}uating all these functions will
then  be easy to manipulate  (e.g. using dot-product, vector-product,
vector-addition, etc.), and the functions themselves may be inspected
for semantic content.   Nevertheless, much content is lost in passing
from symbolic phrases to small LISP functions.

4. A slight simplification  of the above would  be to just store  the
vector of numbers answering  the fixed set of questions;  i.e., don't
bother storing a bunch of programs which compute them dynamically.

5. Even  simpler would be to try  to assign a single ``worthwhileness"
number to each  concept, in lieu  of the vector  of numbers.   Simple
arithmetic operations  could manipulate Worth  values then.   In some
cases,  this linear  ordering seems  reasonable (``primes"  really are
better than ``palindromes".) Yet in many cases we  find concepts which
are  too different  to  be so  easily compared  (e.g.,  ``numbers" and
``angles".)

6. The least  intelligent solution is  none at all:  each concept  is
considered equally worthwhile  as any other concept.   This threatens
to be combinatorial dynamite.

\yskip

\noindent As  we progress  along the  intelligent
$===\→$ trivial dimension,  we find
that the schemes get easier and easier to code, the Worth  values get
easier and easier to deal with,  but the amount of reliable knowledge
packed into them decreases.

Initially, scheme \6$\#$3\0 above was chosen for AM: a vector of numeric-valued
procedural
answers to a  fixed set of questions.  Here are those questions,  the
components of the Worth vectors for each concept:

\yskip

\han3{ $\bullet $  Overall aesthetic worth. }

\han3{ $\bullet $  Overall utility. Combination of usefulness, ubiquity. }

\han3{ $\bullet $  Age. How many cycles since this concept was created? }

\han3{ $\bullet $  Life-span. Can this concept be forgotten yet? }

\han3{ $\bullet $  Cost.  How much cpu time has been spent on this concept, since its 
creation? }

\yskip

\noindent Notice that in  general no constant  number can answer  one of  these
questions once and for all (consider, e.g., Life-span). Each `answer'
had to be a numeric-valued LISP function.

A  few questions which  crop up often  are not present  on this list,
since they can  be answered trivially  using standard LISP  functions
(e.g.,  ``How much  space does  concept  C use  up?" can  be found  by
calling  the function ``COUNT"  on the property-list of  the LISP atom
``C").

Another kind of question, which was anticipated and did  in fact come
up frequently, is of the form ``How good are the entries on facet F of
this concept?", for various  values of F.   Since there are a  couple
dozen kinds  of facets, this  would mean adding  a couple  dozen more
questions to the list. The line must be drawn somewhere.  If too much
of AM's time  is drained by  evaluating where it  is already, it  can
never progress.

The  heuristic rules  are responsible  for initially  setting up  the
various  entries  on  the  Worth  facets  of  new  concepts, and  for
periodically altering  those entries  for \4all\0  concepts, and  for
delving into those entries when required.


Recent experiments\foo{ See Experiment 1, which is described in
subsection 6.2.2. } have shown that
there was little change in behavior when each vector of functions was
replaced by  a  single numeric  function (actually,  the  sum of  the
values  of the components  of the  ``old" vector of  functions). There
wasn't even  too  much change  when this  was  replaced by  a  constant.
There \4was\0 a noticeable degradation (but no collapse) when
all the concepts' Worth numbers were set equal to each other initially.

For the purposes of this document, then (except for this page and the
discussion of Experiment 1), we may as well assume  that each concept
has  a single number  (between 0  and 1000)  attached as  its overall
``Worth" rating. This number  is set\foo{ The  author initially sets  this
value for the 115 initial concepts.  Heuristic  rules set it for each
concept  created by AM.   } and  referenced and  updated by heuristic
rules.  Experiment 1  can  be  considered  as  showing  that  a  more
sophisticated Worth scheme is not  necessary for the particular kinds
of behaviors that AM exhibits.
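
A minimal sketch of scheme $\#$3, and of Experiment 1's collapse of the vector into a single number, might look like this (the component functions are illustrative, not AM's actual ones):

```python
# Scheme #3: the Worth facet is a vector of numeric-valued functions,
# since no constant can answer a question like Age once and for all.
worth_vector = [
    lambda c: c["aesthetic"],             # overall aesthetic worth
    lambda c: c["utility"],               # overall utility
    lambda c: c["cycle"] - c["created"],  # age, in task-cycles
    lambda c: max(0, 500 - c["cpu_ms"]),  # cost so far (cheap = high score)
]

def worth(concept):
    """Experiment 1's simplification: the sum of the old vector's
    components, clipped to AM's 0..1000 scale."""
    return min(1000, sum(f(concept) for f in worth_vector))
```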

 \SSSEC{Interest}

Now that we know how to judge the overall worth of the concept
``Composition",  let's turn  to the question  of how  interesting some
specific composition is.  Unfortunately, the  Worth facet really  has
nothing to say about that problem. The Worth of the concept ``Compose"
has  little effect  on how  interesting a particular  composition is:
``Count$\circ$Divisors-of" is  very  interesting, and
 ``Insert$\circ$Member"\foo{
INSERT$\circ $MEMBER(x,y,z) is defined to be: 
if  x$\epsilon $y, then insert  `T' into z,  else insert
`NIL' into z. } is less so.   The Worth facets of \4those\0  concepts
will say something about their overall value.   And yet there is some
knowledge,  some ``features"  which would  make any  composition which
possessed them more interesting than a composition which lacked them:

\yskip

\noindent $\bullet $ Are the domain and range of the composition equal to each other?

\noindent $\bullet $ Are interesting  properties  of  each component  of  the  composition
preserved?

\noindent $\bullet $ Are  undesirable  properties  lost  (i.e.,  \4not\0  true  about  the
composition)?

\noindent $\bullet $ Is the new composition equivalent to some already-known operation?

\yskip

These  hints  about ``features  to look  for"  belong tacked  onto the
Composition concept,  since they modify  all compositions. Where  and
how can this be done?

For this  purpose each concept --- including  ``Composition" --- can have
entries on its ``\4Interest\1'' facet. It contains a bunch of  features
which (if  true) would  make any  particular example  of the  current
concept interesting.

The format for the Interest facet is as follows:

\yskip \eightpoint \bf

\han4{< Conflict-matrix  }

\han5{<Feature$↓1$, Value$↓1$, Reason$↓1$, Used$↓1$>}

\han5{<Feature$↓2$, Value$↓2$, Reason$↓2$, Used$↓2$>}

\han6{ $\vdots $ }

\han5{<Feature$↓k$, Value$↓k$, Reason$↓k$, Used$↓k$>}

\han4{ >}

\yskip

\tenpoint \1

\noindent This is  the format  of the  facet itself,  not of each  entry.   The
conflict-matrix  is  special  and  will  be  discussed below.    Each
Feature/Value/Reason/Used quadruple will be termed an ``entry"  on the
Interest facet.

Each ``Feature$↓i$''  is a LISP  predicate, indicating whether  or not
some   interesting   property   is   satisfied.   The   corresponding
``Value$↓i$'' is a  numeric function for  computing just how  valuable
this feature  is.  The ``Reason$↓i$''  is a token  (usually an English
phrase) which is tacked along and moved around, and can be  inspected
by the user.  The  ``Used$↓i$" subpart is a list of  all the concepts
whose  definitions  are known  to incorporate\foo{  Not  {\it satisfy} the
feature. Thus  the general  concept Domain=Range-op  {\it incorporates}
the feature  ``range(x) is one component  of domain(x)" as just  one of
the  conjuncts  in  its  definition.  On  the  other hand,  Set-union
{\it satisfies} the  feature,  since its  range,  Sets, really  is  one
component  of  its  domain. }  this  feature;  all  examples of  such
concepts will then automatically satisfy this Feature$↓i$.


For example, here is one entry from the Interest facet of Compose:

\yskip

\han3{{\6FEATURE: \it Domain(Arg1)=Range(Arg2)}}

\han3{{\6VALUE: \it  $ .4 + .4\cdot Worth(Domain(Arg1)) + 
 .2\cdot Priority(current task)$ }}

\han3{{\6REASON: \it ``The composition of  Arg1 and Arg2 will  map from C back
into C"}}

\han3{{\6USED: \it Compose-with-self-Domain=Range-operation, 
Interesting-compose-4}}

\yskip


Just  as with  Isa's  and  Generalizations,  we can  make  a  general
statement about Interest features:

\yskip

\han4{\6Any  feature  tacked  onto the  Interest  facet  of  any member  of
ISA's(C), also applies to C.}

\yskip

That is, X.Interest is relevant to C  iff C is an example of X.   For
example, any feature which makes an operation interesting, also makes
a composition interesting.

So we'd  like to define the function Interests(C) as the union of the
Interest features found tacked onto any member  of ISA's(C).\foo{ Recall
that     the      formula     for     this      is     ISA's(C)     =
Generalizations(Isa(Generalizations(C))). } But  some of these  might
have already been  conjoined to a  definition, to form the  concept C
(or  a  generalization  of  C).    So  all  C's  will  trivially  (by
definition) satisfy such features. The USED subparts can  be employed
to find such  features.  In fact, the final  value of Interests(C) is
Interest(ISA's(C)) {\it minus}  all the
features whose USED subparts pointed to any member of ISA's(C).
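
The computation of Interests(C) can be sketched as follows (hypothetical Python; the two tables stand in for AM's facet structures):

```python
def interests(C, isas, interest_facet):
    """Interests(C): Interest entries gathered from every member of
    ISA's(C), minus those whose USED subpart names a member of ISA's(C)
    (such features are satisfied trivially, by definition)."""
    ancestors = isas[C]
    kept = []
    for X in ancestors:
        for entry in interest_facet.get(X, []):
            feature, value, reason, used = entry
            if not any(u in ancestors for u in used):
                kept.append(entry)
    return kept
```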

This covers the  purpose of each subpart  of each entry on  a typical
Interest  facet. Now  we're  ready to  motivate the  presence  of the
Conflict-matrices.

Often, AM will specialize a concept by conjoining onto its definition
some  features   which  would  make   any  example  of   the  concept
interesting.  So \4any\0 example  of this new  specialized concept is
thus guaranteed to be an \4interesting\0 example of the  old concept.
Sometimes, however,  a pair of  features are exclusive: both  of them
can never be satisfied simultaneously. For example, a composition can
be interesting if its domain and range coincide, but it can
also be interesting if its range is much more interesting than its domain.
Clearly,  these two Interestingness features  can't
both be true (``x=y" and ``x much more  interesting than y" can't occur
simultaneously).   If AM didn't  have some systematic  way to realize
this,   however,   it   might   create   a   new    concept,   called
Interesting-composition, defined  as any composition  satisfying both
of  those features.  But then  this concept  will be  vacuous: \4no\0
operation can possibly satisfy that over-constrained definition; this
new  concept will have  no examples;  it is the  null concept;  it is
trivially forgettable. Merely to think of it is a blot on AM's  claim
to rationality.  This is where the Conflict-matrix fits in.

The  ``Conflict-matrix" is  specified  to  prevent many  such  trivial
combinations  from eating up  a lot of  AM's time.
If  there are K features  present
for the Interest facet of  the concept, then its conflict-matrix will
be  a K$\times $K matrix. In row i, column  j of this matrix is a letter,
indicating the relationship between features i and j:



\noindent \6E \it \  Exclusive of each other: \rm they 
both can't be true at the same time.

\noindent $\→$ \it Implies: \rm If feature i holds, then feature j must hold.

\noindent $←$ \it Implied by: \rm If feature j holds, then so does feature i.

\noindent \6= \it Equal: \rm Feature i holds precisely when feature j holds.

\noindent \6U \it \  Unrelated: \rm  As far as known, 
there is no connection between them.

\yskip


These little relations are  utilized by some of the  heuristic rules.
Here is one  such rule.  Its purpose is  to create a new, specialized
form of concept C, if many  examples of C were previously found  very
quickly.


\han2{{\it IF Current-task is (Fillin Specializations of C)}}

\han3{{\it and $\|$C.Examples$\|$ > 30}}

\han3{{\it and Time-spent-on-C-so-far < 3 cpu seconds,}}

\han3{{\it and Interests(C) is not null,}}

\han2{{\it THEN create a new concept named Interesting-C,}}

\han3{{\it Defined as the conjunction of C.Defn and the highest-valued member of
Interests(C)  which  is {\6U}\  (unrelated)  to any  feature  USED  in the
definition of C.}}

\han3{{\it and  add the  following  task  to  the  agenda:  Fillin  examples  of
Interesting-C, with value computed as the Value subpart of the chosen
feature,   for  this   reason:  ``Any  example   of  Interesting-C  is
automatically an interesting example of C".}}

\han3{{\it and add ``Interesting-C" to the  USED subpart of the entry  
where that
feature was originally plucked from.}}

\yskip
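
The core of that rule, consulting the conflict-matrix, can be sketched like this (Python; the feature names and the dict-based matrix are illustrative):

```python
def specialize(C_name, interest_entries, conflict, used_in_C):
    """Pick the highest-valued feature that is 'U' (unrelated) to every
    feature USED in C's definition; conflict[(i, j)] is one of
    'E', '->', '<-', '=', 'U' (absent pairs default to 'U')."""
    ok = [e for e in interest_entries
          if all(conflict.get((e["feature"], g), "U") == "U"
                 for g in used_in_C)]
    if not ok:
        return None          # every candidate conflicts: no vacuous concept
    best = max(ok, key=lambda e: e["value"])
    return {"name": "Interesting-" + C_name,
            "defn": (C_name + ".Defn", best["feature"]),
            "task_value": best["value"]}
```

Returning `None` when every candidate feature conflicts is precisely what keeps AM from defining an over-constrained, trivially forgettable concept.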

 \SSSEC{Suggest}

This section describes a space-saving ``trick", and a ``fix-up" to undo
some  potentially serious  side-effects of that  trick.   
AM maintains two numeric threshholds:
Do-threshhold and a lower one, Be-threshhold.

When  a  new  task  is proposed,  if  its  global  priority  is below
Be-threshhold, then  it won't  even be  entered on  the agenda.  This
value is  set so low  that any task  having even one  mediocre reason
will make it onto the agenda.

After a task is finished executing, the top-rated one from the agenda
is  selected  to work  on  next.  If  its priority  rating  is  below
Do-threshhold,  however,  it  is  put  back  on  the agenda,  and  AM
complains that  no task  on the  agenda is  very  interesting at  the
moment. AM then  spends a minute or so looking  around for new tasks,
re-evaluating the priorities of the tasks on the agenda already, etc.

One  way to  find new  tasks (and  new reasons  for already-existing
tasks) is to evaluate the ``\4Suggest\1'' facets of all the concepts in
the  system.  More   precisely,  each  Suggest  facet  contains  some
heuristics, encoded into  LISP functions.   Each  function accepts  a
number N as an argument (representing the minimum tolerable priority value
for a
new task), and the function returns as its output a list of new tasks.
These are then merged into the agenda, if desired.

Semantically, each function  is one heuristic  rule for suggesting  a
new task which might be very plausible, promising, and \4a propos\0 at
the  current time.  For  example, here is one  entry from the Suggest
facet of Any-concept:

\yskip

\han3{{\it IF there are no examples for concept C filled in so far,}}

\han3{{\it THEN consider  the task  ``Fillin examples  of C",  for the  following
reason: ``No examples  of C filled in so far",  whose value is half of
Worth(C).  If that  value is below arg1,  then forget it;  otherwise,
try to add it to the agenda.}}

\yskip

\noindent The argument  ``arg1" is that  low numeric value,  N, supplied  to the
Suggest facet.

This  entry alone will  produce a  multitude of potential  tasks; for
concepts whose Worth numbers are high, or for which a task is already
on the agenda  to fill in their examples, these  suggested tasks will
be remembered; most of the other ones will typically be forgotten.

One use  of this facet is thus to ``beef up" the agenda whenever AM is
discontented with all the tasks thereon. At such a  time, AM may call
on all  the Suggest facets in  the system, and a large  volume of new
tasks will be  added to  the agenda. Many  of them  will exist  there
already,  but for  different  reasons, so  many  old tasks'  priority
values  will rise.   After  this  period of  suggesting is  over, the
agenda's highest-ranking task will hopefully have a higher value than
any  did  before.     Also  at  this  time,   the  Be-threshhold  and
Do-threshhold  numbers are reduced. So there  are two reasons why the
top task may  now be  rated higher than  Do-threshhold. If it  isn't,
then the threshholds are lowered again, and again all the Sugg facets
are triggered (this time with a lower N value).

Both threshholds  are  raised  slightly every  time  AM  succeeds  in
picking  and executing  a task.  So  they follow  a  pattern of  slow
increase,  followed by a  sudden decrement, followed  by another slow
increase, etc.   This  was  intended to  mimic a  human's  increasing
expectations  as he  makes  progress.
It also is suggestive of
the way a  human strains his mind  when an obstacle to  that progress
appears; if the straining doesn't produce a brilliant new insight, he
grudgingly is willing to reduce his expectations, and perhaps  resume
some ``old path" abandoned earlier.
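
The two-threshhold control regime can be sketched as one scheduling step (Python; the particular increments and decrements are illustrative):

```python
def run_cycle(agenda, state, suggest_fns):
    """One scheduling step.  agenda: list of (priority, task) pairs;
    state holds the Be- and Do-threshholds."""
    agenda.sort(reverse=True)
    if agenda and agenda[0][0] >= state["do"]:
        state["be"] += 5                  # slow rise after each success
        state["do"] += 5
        return agenda.pop(0)
    state["be"] -= 50                     # sudden decrement when stuck
    state["do"] -= 50
    for f in suggest_fns:                 # trigger all the Suggest facets
        for pri, task in f(state["be"]):  # f returns tasks rated >= N
            if pri >= state["be"]:
                agenda.append((pri, task))
    return None
```

A first call with a dull agenda returns nothing but lowers both threshholds and gathers suggestions; the next call can then pick up a newly suggested, higher-rated task.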

 \SSSEC{Fill/Check}

\qq{To doubt everything doesn't suffice; one must know {\sl why} he 
doubts.}{Poincar\'e}


There is one more level of structure to AM's representation of a concept than
the simple ``properties on a property-list" image.
Each concept consists of a bunch of facets; each
facet follows the format laid down for it (and described in the
preceding several subsections).
Yet each facet of each concept can have two additional ``subfacets"
(little slots that are hung onto any desired slot) named \4Fill\-in\0 and
\4Check\1.

The ``Fill\-in" field of facet F of concept C is abbreviated 
C.F.Fill\-in.  The format of that subfield is a list of
heuristic rules,
encoded into LISP functions.
Semantically, each rule in C.F.Fill\-in should be relevant to
fill\-ing in entries for facet F of  any concept which is a C.
This substructure is an implementation answer to the question of where to place
certain heuristic rules.

As an illustration, let me describe a typical rule which is found on
{\bf Com\-pose.Ex\-am\-ples.Fill\-in}. 
According to the last paragraph, this must be useful for fill\-ing in
ex\-am\-ples of any operation which is a composition.
The rule says that if the composition A$\circ$B is
formed from two very time-consuming operations A and B, then it's worth
trying to find some ex\-am\-ples of A$\circ$B by symbolic means; in this case,
scan the ex\-am\-ples of A and of B, for some pair of ex\-am\-ples
x$→$y (ex\-am\-ple of B) and y$→$z (ex\-am\-ple of A). Then posit that
x$→$z is an ex\-am\-ple of A$\circ$B.
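
That symbolic rule amounts to composing the two example tables (a Python sketch, with examples rendered as input-to-output dicts):

```python
def symbolic_examples(examples_A, examples_B):
    """Examples of A o B: for each x->y among B's examples and y->z
    among A's, posit x->z (no need to run the slow algorithms)."""
    return {x: examples_A[y]
            for x, y in examples_B.items() if y in examples_A}
```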

As another  illustration, let me describe a typical rule which is found on
{\bf Compose.Conjec.Fill\-in}. It 
says that one potential conjecture about a given 
composition A$\circ$B is that it is unchanged from A (or from B). This happens
often enough that it's worth examining each time a new composition is made.
This rule applies precisely to the task of fill\-ing in conjectures about particular
compositions. 

Let's take yet a third illustration.
The subfacet {\bf Any-Concept.Ex\-am\-ples.Fill\-in}\0 
is quite large; it contains all the known
methods for fill\-ing in ex\-am\-ples of C (when all we know is that C is a concept).
Here are a few of those techniques\foo{ The interested reader will find them all listed
in Appendix 2.2.37.}:

\yskip

\noindent $\bullet $  Instantiate C.Defn 

\noindent $\bullet $  Search the
ex\-am\-ples facets of all the concepts on Generalizations(C) for ex\-am\-ples of C

\noindent $\bullet $   Run some of the concepts named in 
In-ran-of(C)\foo{ i.e., operations whose range
is C} and collect the resultant values.

\yskip

{\bf Any-Concept.Ex\-am\-ples.Check} is large for similar reasons.
A typical entry there says to examine each verified ex\-am\-ple of C: if
it is also an ex\-am\-ple of a specialization of C, then it must be removed
from C.Ex\-am\-ples and inserted\foo{ Conditionally. Since each concept is of finite
worth, it is allotted a finite amount of space. A random number is generated to
decide whether or not to actually insert this ex\-am\-ple into the Ex\-am\-ples facet of
the specialization of C. The more that specialized concept is ``exceeding its
quota",
the narrower the range that the random number must fall into to have that
new item inserted. The probability is never precisely 1 or 0. }
into the Ex\-am\-ples facet of that specialized concept.

Here is one typical entry from {\bf Operation.Domain/Range.Check}:

\yskip

\han3{{\it IF a domain/range entry has the form (D D D$\ldots   \ \→\ $ R),}}

\han4{{\it and all the D's are equal, and R is a generalization of D,}}

\han3{{\it THEN it's worth seeing whether (D D D$\ldots   \ \→\ $ D) 
is consistent with all known ex\-am\-ples of the operation.}}

\han3{{\it If there are no known ex\-am\-ples, 
add a task to the agenda requesting they be filled in.}}

\han3{{\it If there are ex\-am\-ples, and (D D D$\ldots  \ \→\ $ D) 
is consistent, add it to the Domain/range
facet of this operation.}}

\han3{{\it If there are some contradicting ex\-am\-ples, create a new concept which is defined as
this operation restricted to (D D D$\ldots \ \→\ $ D).}}

\yskip

\noindent Note that this ``Checking" rule doesn't just passively check the designated facet; it
actively ``fixes up" faulty entries, adds new tasks, creates new concepts, etc.
All the check rules are very aggressive in this way.
For ex\-am\-ple, one entry on {\bf No-multiple-elements-structure.Exam\-ples.Check}
will actually remove any multiple occurrences of an element from a structure.
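The Domain/range entry quoted above can be rendered as a short Python sketch (the data structures here are assumed for illustration; they are not AM's LISP representation):

```python
def check_domain_range(op, is_example_of, generalizations, agenda):
    """For an entry (D D ... -> R) with all domains equal and R a
    generalization of D, test whether the tighter entry (D D ... -> D)
    is consistent with the operation's known examples; if there are no
    examples, add a fill-in task to the agenda; if some contradict,
    propose a restricted concept."""
    new_concepts = []
    for doms, rng in list(op['domain_range']):
        if len(set(doms)) == 1 and rng in generalizations.get(doms[0], ()):
            d = doms[0]
            if not op['examples']:
                agenda.append(('fill-in-examples-of', op['name']))
            elif all(is_example_of(val, d) for _args, val in op['examples']):
                op['domain_range'].append((doms, d))   # consistent: add it
            else:
                new_concepts.append(op['name'] + '-restricted-to-' + d)
    return new_concepts
```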

The set Checks(C.F) of all relevant rules for
checking facet F of concept C is obtained as
(ISA's(C)).F.Check. That is, look for the Check subfacet of the
F facet of all the concepts on ISA's(C).
Similarly, Fill\-ins(C.F) is the union of the Fill\-in subfacets of the  F facets of
all the concepts on ISA's(C).

When AM chooses a task like ``\6Fill\-in ex\-am\-ples of Primes\1", 
its first action is to
compute Fill\-ins(Primes.Exs). 
It does this by asking for ISA's(Primes); that is, a list of all concepts of which
Primes is an ex\-am\-ple. This list is: <Objects Any-concept Anything>.
So the relevant heuristics are gathered from  Objects.Exs.Fill\-in, etc.
This list of heuristics is then executed, in order
(last executed are the heuristics attached to Anything.Exs.Fill\-in).
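The rippling computation just described might look like this in Python (an illustrative sketch; the concept records are assumptions of mine):

```python
def fillins(concept, facet, concepts):
    """Fillins(C.F): the union, in order, of the Fillin subfacets of
    the F facets of every concept on ISA's(C) -- so the heuristics
    attached to Anything run last."""
    rules = []
    for general in concepts[concept]['isa']:
        rules.extend(concepts[general].get(facet, {}).get('fillin', []))
    return rules
```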

 \SSSEC{Other Facets which were Considered}

Most facets (like ``Definitions") were anticipated from the very beginning
planning of AM, and proved just as useful as expected. 
Others (like ``Intuitions")
were also expected to be very important, yet were a serious disappointment. 
Still others (like ``Suggest") were unplanned and grumblingly acknowledged as
necessary for the particular LISP program that bears the name AM.
Finally, we turn to a few facets which were initially planned, and yet which
were adjudged useless around the time that AM was coded. They were therefore
never really a part of the LISP program AM, although they figured in its
proposal. Let me list them, and explain why each one was dropped.

\yskip
 
$\bullet $  \2 UN-INTERESTINGNESS.\1  This was to be similar to the 
Interest facet. It would
contain entries of the form feature/value/reason, where the feature would be
a \4bad\0 (dull, trivializing, undesirable, uninteresting) property that an
entity (a concept or a task) might possess. 
If it did, then the value component would return a
negative number as its contribution to the worth/priority of that entity.
This sounded plausible, but turned out to be useless in practice:
(i) There were very few features one could point to which explicitly indicated
when something was boring; (ii) Often, a conjunction of many such features
would make the entity seem unusual, hence interesting; (iii) Most entities
were viewed as very mediocre unless/until specific reasons to the contrary arose,
and in those cases the presence of a few boring properties would be overshadowed
by the few non-boring ones. In a sea of mediocrity, there is little need to
separate the boring from the very boring. 

$\bullet $  \2 JUSTIFICATION.\1 
For conjectures
which were not yet believed with certainty, this part would contain all the known
evidence supporting them.
This would hopefully be convincing, if
the user (or a concept) ever wanted to
know.
In cases of contradictions arising
somehow, this facet was to keep hold of the threads that could be untangled
to resolve those paradoxes. As described earlier, this duty could naturally be 
assumed by the Conjecs facet of each concept. The other intended role for this
facet was to hold sketches of the proofs of theorems. 
Unfortunately, the intended concepts for Proof and Absolute truth were
never implemented, and thus most of the heuristic rules which would have interacted
with this facet are absent from AM. It simply was never needed.

$\bullet $  \2 RECOGNITION.\1 
Originally, it was assumed that the location of relevant concepts and
their heuristics would be much more like a
free-for-all (pandemon\-i\-um) than an orderly rippling process.
As with the original use of BEINGs\foo{ Interacting knowledge modules, each module
simulating a different expert at a round-table meeting. See [Lenat 75b]. }, the
expectation was that each concept would have to ``shout out" its relevance whenever
the activities triggered some recognition predicate inside that concept.
Such predicates were to be stored in this facet.
But it quickly became apparent that the triggering predicates which were the
left-hand-sides of the heuristic rules were quick enough to obviate the need
for pre-processing them too heavily. Also, the only rules relevant to a given
activity on concept C always seemed to be attainable by rippling in a certain
direction away from C. This varied with the activity, and a relatively small table
could be written to specify which direction to ripple in (for any given desired
activity). We see that for ``Fill-in examples of$\ldots $", the direction to ripple in
is ``Generalizations", to locate relevant heuristic rules.
For ``Judge interest of$\ldots $" the direction is also generalizations. For
``Access specializations of", the direction is Specializations, etc.
The only important point here is that the Recognition facet was no longer needed.

\SSEC{AM's Starting Concepts}


The first  subsection presents a  diagram of the  top-level (general)
concepts   AM   started   with,  with   the   lines   indicating  the
Generalizations/Specializations kinds of  relationships (single  line
links)  and  a  few  Examples/Isa's links  (triple  vertical  lines).
Several  specific concepts have  been omitted from  that picture. All
the concepts initially fed  to AM are then listed  alphabetically and
described.
Finally, in Section 2.5.2.3, we discuss
the choice of starting concepts.

 \SSSEC{Diagram of Initial Concepts}

\vskip 6in % Put the tree of concepts  diagram here

The diagram above represents the ``topmost" concepts
which AM had initially,
shown connected via
Specialization links (single lines) and Examples links (triple lines).
The only concepts not diagrammed are the 47 
\4examples\0 of the concept Operation.

Also, we should note that many entities exist in the system
which are not themselves concepts. For example, the number ``3", though it
is an \4example\0 of many concepts, is not itself a concept.
All entities which \4are\0 concepts are present on the list called
CONCEPTS, and they all have property lists (with facet names as the
properties). In hindsight, this somewhat arbitrary scheme is regrettable.
A more aesthetic designer might have come up with a more uniform system
of representation than AM's.

 \SSSEC{Summary of Initial Concepts}


Since the precise set of concepts is not central to the design of AM,
nor to the quality of AM's behavior, it is not worth detailing here.
On the  other  hand,  a  cursory familiarity  with  their  names  and
definitions should aid the reader  in building up an understanding of
what AM  has done.  For that reason, the concepts will now be briefly
described, in alphabetical order.

\6ACTIVITY\0 represents  something  that  can be  ``performed".    All
Actives  --- and  \4only\0  Actives ---  have  Domain/range facets  and
Algorithms facets.

\6ALL-BUT-FIRST-ELEMENT\0  is  an  operation which  takes  an ordered
structure and removes  the first element from  it.  It is  similar in
spirit to the Lisp function ``CDR".

\6ALL-BUT-LAST-ELEMENT\0 takes  an ordered structure  and removes its
last element.

\6ANY-CONCEPT\0 is  useful  because it  holds  all the  very  general
tactics for filling  in and checking  each facet.  The  definition of
Any-concept is \6``$\lambda $ (x) x$\epsilon $CONCEPTS"\1. 
`\6CONCEPTS\1' is AM's global
list of entities known to be concepts. Initially, this  list contains
the hundred  or so  concepts which  AM starts  with (e.g.,  all those
diagrammed on the preceding page).


\6ANYTHING\0  is defined  as \6``$\lambda $ (x)  T"\1; i.e.,  a predicate which
will \4always\0 return  true.   Notice that the  singleton $\{a\}$ is  an
example of Anything, but (since it's not on the list \6CONCEPTS\0) it
is not an example of Any-concept.

\6ATOM\0 contains  data  about  all  primitive,  indivisible  objects
(identifiers, constants, variables).

\6BAG\0  is a  type of  structure.   It  is  unordered, and  multiple
occurrences of the same element are permitted. They are isomorphic to
the concept known as `multiset',  except that we stipulate that  sets
are \4not\0 bags.

\6BAG-DELETE\0 is  an operation which  takes two arguments, x  and B.
Although  x can  be anything, B  must be a  bag. The  procedure is to
remove one occurrence of x from B.

\6BAG-DIFF\0 is an operation which takes two bags  B,C. It repeatedly
picks a member of C,  and removes it (one occurrence of it) from both
B and C. This continues until C is empty.

\6BAG-INSERT\0 is an operation which  adds (another occurrence of)  x
into bag B.

\6BAG-INTERSECT\0 takes  two bags B,C,  and creates a  new bag  D. An
item occurs in  D the \4minimum\0 number of times it occurs in either
B or C.

\6BAG-UNION\0 takes bag C and dumps all its elements into bag B.
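The five bag operations above map neatly onto multisets, here modeled with Python's Counter type (a sketch of the behavior described, not AM's representation):

```python
from collections import Counter

def bag_delete(x, b):        # remove ONE occurrence of x from B
    b = b.copy()
    if b[x] > 0:
        b[x] -= 1
    return +b                # unary + drops zero counts

def bag_diff(b, c):          # remove each occurrence in C from B
    return +(b - c)

def bag_insert(x, b):        # add (another occurrence of) x to B
    b = b.copy()
    b[x] += 1
    return b

def bag_intersect(b, c):     # minimum number of occurrences
    return +(b & c)

def bag_union(b, c):         # dump all of C's elements into B
    return +(b + c)
```

Note that bag-union is the additive combination (`+`), not the max-count union (`|`): the text says \4all\0 of C's elements are dumped into B.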

\6CANONIZE\0  is  both  an   example  of  and  a  specialization   of
`Operation'.  It accepts two predicates  P1 and P2 as arguments, both
defined over some domain A$\times $A, where P1 is a generalization of P2.
Canonize  then  tries to  produce  a  ``standard  representation"  for
elements of A, in the following way. It creates an operation f from A
into A, satisfying: \it 
P1(x,y) iff  P2(f(x),f(y))\1. Then any item  of
the form  f(x) is called  a canonical  member of A.  The set of  such
canonical-A's  is worth  naming, and  it  is worth  investigating the
restrictions of various operations' domains and ranges to this set of
canonical-A's\foo{ i.e., take an operation which used to have ``A" as one
of  its domain components  or as its  range, and try to  create a new
operation with essentially the same definition but whose domain/range
says ``Canonical-A"  instead of ``A".  }.  ``Canonize"  contains lots of
information relevant to creating such functions f (given P1 and  P2).
Thus Canonize is an  example of the concept Operation.  Canonize also
contains information  relevant to dealing with any  and all such f's.
So Canonize is a specialization of Operation.
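A concrete instance (a reconstruction of the best-known use of Canonize, by which AM eventually reached natural numbers; the function names are mine): take A = Bags, P2 = Equality, and P1 = Same-length. The synthesized f maps every bag to a canonical bag of T's:

```python
def same_length(x, y):
    """P1: a generalization of Equality (P2) on bags."""
    return len(x) == len(y)

def canonical(bag):
    """The synthesized f : Bags -> Bags.  Every bag maps to the
    canonical bag containing only T's, so that P1(x,y) iff
    P2(f(x), f(y)).  These canonical bags behave exactly like the
    natural numbers."""
    return ('T',) * len(bag)
```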

\6COALESCE\0 admits  the  same  duality\foo{ Both  a  specialization  of
Operation and an example of Operation. }.  This very useful operation
takes as its argument any operation F(a,b,c,d$\ldots $), locates two domain
components which  intersect  (preferably, which  are  equal; say  the
second  and third), and  then creates  a new  operation G  defined as
G(a,b,d$\ldots $) =$↓{df}$ F(a,b,b,d$\ldots $).
That is, F is called upon with a  pair
of arguments equal to each  other.  If F were Times, then  G would be
Squaring.   If F  were Set-insert, then  G would be  the operation of
inserting a set S into itself.
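A minimal Python sketch of Coalesce (here the first two arguments are merged, for simplicity; AM actually searches F's Domain/range facet for any two components which intersect):

```python
def coalesce(f):
    """Coalesce(F): return G with one duplicated argument, so that
    G(a, d, ...) = F(a, a, d, ...)."""
    def g(a, *rest):
        return f(a, a, *rest)
    return g

square = coalesce(lambda x, y: x * y)   # Coalesce(Times) = Squaring
```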

\6COMPOSITION\0 involves taking two operations A and B,  and applying
them in  sequence: A$\circ$B(x) =$↓{df}$ A(B(x)). This concept  deals with (i)
the   activity  of  creating  new  compositions,   given  a  pair  of
operations;  (ii) all  the  operations  which were  created  in  this
fashion. That is why this concept is both a specialization of and an
example of Operation.
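In Python, the composing activity itself is one line (a sketch):

```python
def compose(a, b):
    """A o B: first apply B, then A, so (A o B)(x) = A(B(x))."""
    return lambda *args: a(b(*args))
```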

\6CONJECTURES\0 are a kind of object. This concept knows about --- and
can store --- conjectures. When proof techniques are inserted into AM,
this  tiny  twig   of  the  tree  of  concepts  will  grow  to  giant
proportions.

\6CONSTANT-PREDICATE\0 is a predicate which can afford to have a very
liberal domain: it always ignores  its arguments and just returns the
same logical value all the time.

\6DELETE\0 is  an operation which contains all the information common
to all flavors of removing an element from a structure (regardless of
the type of  structure which is being attenuated).   When called upon
to actually perform a deletion,  this concept determines the type  of
structure and then  calls the appropriate specialized  delete concept
(e.g., Bag-delete).

\6DIFFERENCE\0  is  another  general  operation,  which  accepts  two
structures, determines their  type (e.g., Bags),  and then calls  the
appropriate specialized version of difference (e.g., Bag-diff).

\6EMPTY-STRUCTURE\0  contains data  relevant  to  structures with  no
members.

\6FIRST-ELEMENT\0  is an  operation which takes  an ordered structure
and returns the first element. It is like the Lisp function `CAR'.

\6IDENTITY\0 is just what it claims to be.  It takes one argument and
returns it immediately. The main purpose of knowing about this boring
transformation  is  just   in  case  some   new  concept  turns   out
unexpectedly to be equivalent to it.

\6INSERT\0 takes an  item x and  a structure S, determines  S's type,
and  calls the appropriate  flavor of  specialized Insertion concept.
The general INSERT concept contains any information common to  all of
those insertion concepts.

\6INTERSECT\0 is an operation which  computes the intersection of any
two  structures.   It, too, has  a separate  specialization for Bags,
Sets, Osets, and Lists.


\6INVERT-AN-OPERATION\0 is a very active concept.  It  can invert any
given operation.  If F:X$→$Y  is an operation, then its inverse will be
abbreviated F\inv , and  F\inv (y) is  defined as all  the x's  in X  for
which F(x)=y.   The domain and range of  F\inv \  are thus the  range and
domain of F.
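A sketch of inversion, realized over an explicit finite domain (my simplification; the real concept manipulates AM's operations symbolically):

```python
def invert(f, domain):
    """F-inverse: f_inv(y) = all x in the domain with F(x) = y."""
    return lambda y: [x for x in domain if f(x) == y]
```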

\6INVERTED-OP\0  contains  information specific  to  operations which
were created as the inverses of more primitive ones.

\6LAST-ELEMENT\0 takes  an ordered  structure and  returns its  final
member.

\6LIST\0  is a  type  of  structure.   It  is  ordered, and  multiple
occurrences of  the same element are permitted. Lists are also called
vectors, tuples, and obags (for ``ordered bags").

\6LIST-DELETE\0 is an operation  which takes two arguments, x  and B.
Although x  can be anything,  B must be  a list. The  procedure is to
remove the first occurrence of x from B.

\6LIST-DIFF\0 is  an  operation  which  takes  two  lists  B,C.    It
repeatedly picks a member  of C, and removes it  (the first remaining
occurrence of it) from  both B and C.  This continues until there are
no more members in C.

\6LIST-INSERT\0 is an operation which adds (another occurrence  of) x
onto the front of list B. It is like the Lisp function `CONS'.

\6LIST-INTERSECT\0 takes two lists B,C, and  creates a new list D. An
item occurs  in D the \4minimum\0 number of times it occurs in either
B or C.  D is arranged in order as (a sublist of) list B.

\6LIST-UNION\0 takes list C and glues it onto the end of list B. It's
like `APPEND' in Lisp.

\6LOGICAL-RELATION\0 contains  knowledge about Boolean  combinations:
disjunction, conjunction, implication, etc.

\6MULTIPLE-ELEMENTS-STRUCTURES\0  are a specialized kind of Struc\-ture.
They permit the same atom to occur more than once as a member. (e.g.,
Bags and Lists)

\6NO-MULTIPLE-ELEMENTS-STRUCTURES\0   are    a   specialization    of
Structure. They permit  the same atom to occur only once as a member.
(e.g., Sets and Osets)

\6NONEMPTY-STRUCTURES\0 are a specialization of Structure also.  They
contain data about all structures which have some members.

\6OBJECT\0  is a  general,  static  concept.   Objects  are like  the
subjects and  direct objects in sentences, while the Actives are like
the verbs\foo{ As in English, a particular Activity can sometimes itself
be the subject. }.

\6OBJECT-EQUALITY\0 is  a predicate. It takes a  pair of objects, and
returns True if (i) they are identical, or (ii) they are  structures,
and each  corresponding  pair of  members satisfies  Object-Equality.
Often we'll call this `Equal', and denote it as `='.
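A sketch of this recursive test, with tuples standing in for AM's structures:

```python
def object_equal(x, y):
    """True if x and y are identical atoms, or are structures whose
    corresponding members satisfy Object-Equality in turn."""
    if isinstance(x, tuple) and isinstance(y, tuple):
        return len(x) == len(y) and all(
            object_equal(a, b) for a, b in zip(x, y))
    return x == y
```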

\6OPERATIONS\0 are  Actives which take arguments  and return a value.
While a predicate examines its  arguments and returns either True  or
False, an operation examines \4its\0 arguments and returns any number
of values,  of varying types.  Assuming that the arguments lie in the
domain  of  the  operation  (as  specified  by  some   entry  on  its
Domain/range facet),  then every value  returned must lie  within its
range (as specified by that same Domain/range entry).

\6ORDERED-PAIR\0 is a kind of List. It has just two `slots', however:
a front and a rear element.

\6ORDERED-STRUCTURE\0 is a specialized type of Structure. It includes
all  structures for which the  order of insertion of  two members can
make a  difference  in  whether  the structures  are  equal  or  not.
Ordered-structures are those for which it makes sense to talk about a
front and a rear, a first element and a last element.

\6OSET\0  is  a  type of  structure.   It  is  ordered,  and multiple
occurrences  of   the  same   element   are  not   permitted.     The
short-term-memory of Newell's PSG [Newell 73] is an Oset, as is a cafeteria
line. Not much use was found for this concept by AM.

\6OSET-DELETE\0 removes x from oset B (if x was in B).

\6OSET-DIFF\0 is an operation which  takes two osets B,C. It  removes
each member of C from B.

\6OSET-INSERT\0 is an operation which adds x to  the front of oset B.
If x was in B previously, it is simply moved to the front of B.

\6OSET-INTERSECT\0 takes two  osets B,C, and removes from B any items
which are \4not\0 in C as well. B thus `induces' the ordering  on the
resultant oset.

\6OSET-UNION\0 takes oset C, removes any elements already in B, then
glues what's left of C onto the rear of B.

\6PARALLEL-JOIN\0 is an operation which takes a kind of structure and
an operation H.  It  creates a new operation F, whose domain  is that
type of  structure.  For  any such structure  S, F(S) is  computed by
appending together H(x) for each member x of S.

\6PARALLEL-JOIN2\0 is a similar operation.  It creates an operation  F
with two  structural arguments. F(S,L)  is computed by  appending the
values  of H(x,L), as  x runs through  the elements  of S.\foo{Here, the
args to PARALLEL-JOIN2 are two types of structures SS and LL,  and an
operation H  whose range  is also  a structural type  DD. Then  a new
operation is created, with domain SS$\times $LL and range DD. }

\6PARALLEL-REPLACE\0   is  an   operation  used   to  synthesize  new
substitution operations.  It takes a structural type and an operation
H as its  arguments, and creates a new operation  F. F(S) is computed
by simply replacing each member x of S by the value of H(x). The
operation produced is very much like the Lisp function MAPCAR.

\6PARALLEL-REPLACE2\0  is a  slightly  more  general operation.    It
creates F, where F(S,L) is computed by replacing each x$\epsilon $S by
H(x,L).
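The Parallel-Replace and Parallel-Join synthesizers correspond to the familiar map and flat-map operations; a Python sketch (the list representation is an assumption):

```python
def parallel_replace(h):
    """Parallel-Replace: F(S) replaces each member x of S by H(x),
    much like Lisp's MAPCAR."""
    return lambda s: [h(x) for x in s]

def parallel_join(h):
    """Parallel-Join: F(S) appends together the structures H(x),
    one per member x of S."""
    return lambda s: [y for x in s for y in h(x)]
```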

\6PREDICATES\0 are  actives which  examine their  arguments and  then
return  T  or  NIL  (True  or  False).     It  is  only  due  to  the
capriciousness  of  AM's  initial  design  that  predicates  are kept
distinct from operations.   Of course,  each example of an  operation
can be  viewed as if it  were a predicate; if F:A$→$B  is any operation
from A to  B, then we  can consider F  a relation on  A$\times $B, that is  a
subset of A$\times $B, and from there pass to viewing F as a (characteristic)
predicate  F:A$\times $B$→\ \{$T,F$\}$.  Similarly, any  predicate on 
$A\times \ldots \times B \times C$ may
be  considered  an  operation  (a  multi-valued,   not-always-defined
function) from $A\times \ldots \times  B$ into $C$.  
There are  no unary predicates.   If
there were  one, say P:A$→\ \{T,F\}$, then that predicate would essentially
be a new way to view a certain subset of A;  the predicate would then
be transformed  into $\{a \epsilon A\  |\  P(a)\}$, 
made into a  new concept, tagged
as  a  specialization  of  A,  and  its  definition  would  be  
``$\lambda \, (a) \biglp  A.Defn(a) \meet P(a)\bigrp $".

\6PROJECTION1\0 is  a simple operation.  It is defined  as ``$\lambda $  (x y)
x\1."  Notice  that  Identity is  just a  specialized  restriction of
Proj1. Proj1(Me,You)=Me.

\6PROJECTION2\0 is a similar  operation. Proj2(Me,You)=You.

\6RELATION\0 is any Active which has been  encapsulated into a set of
ordered pairs.   `Relation' bridges the gap between active and static
concepts.

\6REPEAT\0 is an operation for generating new operations by repeating
old ones.  Given as its argument a structural type $\Sscr $ and an existing
operation  H  (with   domain  and  range   of  the  form   
$\Sscr \times \Sscr \, \→\,\Sscr$),
Repeat($\Sscr $,H) synthesizes a brand new operation F. The domain/range of
F is just that of H. F(S) is computed by repeating TEMP$←$H(x,TEMP) for
each element x of S.  TEMP is  initialized as some member (preferably
the first element) of S.
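Repeat is essentially a fold (reduction); a Python sketch (note that, as in the text, the running value TEMP is H's \4second\0 argument):

```python
def repeat(h):
    """Repeat(S,H): F(S) is computed by TEMP <- H(x, TEMP) for each
    remaining element x of S, with TEMP initialized to S's first
    element."""
    def f(s):
        temp = s[0]
        for x in s[1:]:
            temp = h(x, temp)
        return temp
    return f

add_up = repeat(lambda x, temp: x + temp)  # Repeat(List, Plus) = summing
```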

\6REPEAT2\0 is similar, but requires that H take three arguments, and
it  creates  a new operation  F,  
where  F(S,L)   is  computable   by  repeatedly   doing
TEMP$←$H(x,TEMP,L).

\6RESTRICT\0  is  an  operation   which  turns  out  new
operations.   Given as an argument some operation (or predicate) F,
RESTRICT synthesizes a new operation which
has the same definition as F, but with its domain and/or
range curtailed.

\6REVERSE-ORDERED-PAIR\0  transforms  the  ordered  pair  $<x,y>$ into
$<y,x>$.

\6SET\0 is  a type  of  structure.   It  is unordered,  and  multiple
occurrences of the same element are not permitted.

\6SET-DELETE\0 is an  operation which takes  two arguments, x  and B.
Although  x can be anything,  B must be  a set.  The  procedure is to
remove x from B (if x was  in B), then return the resultant value  of
B.

\6SET-DIFF\0 is  an operation  which takes two  sets B,C.  It removes
each member of C from B.

\6SET-INSERT\0 is an operation which adds x to set B.

\6SET-INTERSECT\0  removes from set B any  items which are \4not\0 in
set C, too.

\6SET-UNION\0 dumps  into B  all the members  of C  which weren't  in
there already.


\6STRUCTURE\1, the  antithesis of ATOM,  is inherently divisible.   A
structure is  something that  has members,  that can  be broken  into
pieces.   There  are two  questions one  can  ask about  any kind  of
structure: Is it ordered or not? Can there be multiple occurrences of
the same element in  it or not?   There are four  sets of answers  to
these two questions, and each of the four specifies a well-known kind
of structure (Sets, Lists, Osets, Bags).

\6STRUCTURE-OF-STRUCTURES\0 is a specialization of Structure, representing
those structures all of whose \4members\0 are themselves structures.

\6TRUTH-VALUE\0  is a  specialized  kind of  atomic object.  Its only
examples are True and False.   This concept is the range set  for all
predicates.

\6UNION\0  is  a general  kind  of joining  operation.  It takes  two
structures and combines them. Four separate variants of this  concept
are given to AM initially (e.g., Set-union).

\6UNORDERED-STRUCTURE\0  is a  specialized  type  of Structure.    It
includes  all structures  for  which the  order of  insertion  of two
members never  makes any  difference in  whether  the structures  are
equal or not. Unordered-structures cannot be said to have a front or a
rear, a first element or a last element.

 \SSSEC{Rationale behind Choice of Concepts}

A  necessary  part  of  realizing  AM  was  to  choose  a
particular set of starting concepts. But how should such a  choice be
made?

My first impulse  was to gather a \4complete\0 set  of concepts. That
is, a  basis which would be sufficient to derive all mathematics. The
longer I studied  this, the larger the  estimated size of this  basis
grew.  It immediately became clear that
this would never fit in 256k. \foo{ This is the size of the core  memory
of the computer  I had at my  disposal.  } One  philosophical problem
here  is that future mathematics  may be inspired  by some real-world
phenomena which haven't even been observed yet. Aliens visiting Earth
might have a different mathematics  from ours, since their collective
life experiences could be quite different from those of us Terrans.

Scrapping the idea of a sufficient basis, what about a necessary one?
That is, a basis which  would be \4minimal\0 in the  following sense:
if  you ever removed  a concept  from that basis,  it could  never be
re-discovered.   In isolated  cases, one  can tell  when a  basis  is
\4not\0 minimal:  if it  contains both  addition and  multiplication,
then  it  is too  rich,  since the  latter  can be  derived  from the
former.\foo{ by AM, and by  any mathematician. As Don Cohen points  out,
if the researcher lacked the  proper discovery methods, then he might
never  derive Times  from  Plus. }  And yet,  the same  problem about
``absoluteness" cropped up: how can anyone claim that the discovery of
X can \4never\0 be made  from a given starting point? Until recently,
mathematicians didn't realize  how natural it  was to derive  numbers
and arithmetic from set  theory (a task which AM does,  by the way)\foo{
The ``new  math" is trying to  get young children to  do this as well;
unfortunately, no  one  showed  the  elementary-school  teachers  the
underlying harmony,  and the results have  been saddening. }.   So 50
years  ago the concepts  of set  theory and number  theory would both
have been undisputedly placed into a ``minimal" basis.  There are thus
no  absolute conceptual primitives;  each culture (perhaps  even each
individual) possesses its own basis.

Since I  couldn't give  AM a  minimal basis,  nor a  complete one,  I
decided AM might as  well have a \4nice\0 one. Although  it can never
be  minimal, it  should  nevertheless be  made very  small  (order of
magnitude: 100  concepts).   Although it  can never  be complete,  it
should suffice for re-discovering  much of already-known mathematics.
Finally, it should be \4rational\1, by which I mean that there should
be a simple rule for  deciding which concepts do and don't  belong in
that basis.

The concepts AM starts with  are meant to be those possessed by young
children (age 4, say). This explains some omissions of concepts which
would otherwise be  considered fundamental: (i) Proof  and techniques
for  proof/disproof;  (ii)  Abstract  properties  of relations,  like
associativity, single-valued,  onto; (iii)  Cardinality,  arithmetic;
(iv) Infinity,  continuity, limits. The interested  reader should see
[Piaget 55] or [Copeland 70].

Because  my programming time and the  PDP-10's memory space were both
quite  small,  only  a  small  percentage  of  these  `pre-numerical'
concepts  could be  included.   Some unjustified  omissions  are: (i)
visual operations,  like  rotation, coloration;  (ii)  Games,  rules,
procedures,  strategies,  tactics;  (iii)  Geometric  notions,  e.g.,
outside and between.

AM is  not supposed to be a model of a  child, however.  It was never
my intention (and it would be much too hard for me) to try to emulate
a human child's  whimsical imagination and emotive drives.  And AM is
not  ripe for  ``teaching", as are  children.\foo{ Learning psychologists
might  label   AM  as   neo-behavioristic   and  cognitivistic.   See
[Le\-Fran\-cois 72].  }  
Also, though it possesses  a child's ignorance of most concepts, AM is given
a large body of sophisticated ``adult" heuristics. So perhaps
a  more faithful  image  is  that  of Ramanujan,  a
brilliant modern mathematician  who received a  very poor  education,
and  was forced  to re-derive  much  of known  number  theory all  by
himself. Incidentally, Ramanujan never did master the concept of formal proof.


There is  no formal justification for the  particular set of starting
concepts.  They are all reasonably primitive (sets, composition), and
lie several  levels ``below" the  ones which AM managed  to ultimately
derive  (prime factorization, square-root).  It  might be valuable to
attempt a similar automated math discoverer, which began  with a very
different set of concepts (e.g., start it out as an expert in lattice
theory, possessing all known concepts thereof).  The converse kind of
experiments are to vary the initial base of concepts, and observe the
effects  on  AM's behavior.    A  few experiments  of  that  form are
described in Section 6.2.